A Critical Look at Some Analyses of Major League Baseball Salaries |
| |
Authors: | David C. Hoaglin; Paul F. Velleman |
| |
Institution: | 1. Abt Associates Inc., 55 Wheeler St., Cambridge, MA 02138, USA; 2. Cornell University, 358 Ives Hall, Ithaca, NY 14853, USA |
| |
Abstract: | At a data analysis exposition sponsored by the Section on Statistical Graphics of the ASA in 1988, 15 groups of statisticians analyzed the same data about salaries of major league baseball players. By examining what they did, what worked, and what failed, we can begin to learn about the relative strengths and weaknesses of different approaches to analyzing data. The data are rich in difficulties. They require reexpression, contain errors and outliers, and exhibit nonlinear relationships. They thus pose a realistic challenge to the variety of data analysis techniques used. The analysis groups chose a wide range of model-fitting methods, including regression, principal components, factor analysis, time series, and CART. We thus have an effective framework for comparing these approaches so that we can learn more about them. Our examination shows that approaches commonly identified with Exploratory Data Analysis are substantially more effective at revealing the underlying patterns in the data and at building parsimonious, understandable models that fit the data well. We also find that common data displays, when applied carefully, are often sufficient for even complex analyses such as this. |
| |
Keywords: | Data analysis; Outliers; Regression; Transformation; Variable selection |