Frequency of Selecting Noise Variables in Subset Regression Analysis: A Simulation Study |
| |
Authors: | Virginia F. Flack; Potter C. Chang |
| |
Affiliation: | Division of Biostatistics, School of Public Health, University of California, Los Angeles, CA 90024, USA |
| |
Abstract: | This article presents the results of a simulation study of variable selection in a multiple regression context that evaluates the frequency of selecting noise variables and the bias of the adjusted R² of the selected variables when some of the candidate variables are authentic. It is demonstrated that for most samples a large percentage of the selected variables is noise, particularly when the number of candidate variables is large relative to the number of observations. The adjusted R² of the selected variables is highly inflated. |
| |
Keywords: | Variable selection; Exploratory regression analysis |
|
|
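The kind of experiment the abstract describes can be illustrated with a small simulation. The sketch below is not the authors' design: the abstract does not state the sample sizes, the number of candidate variables, or the selection procedure, so the forward-selection rule, the F-to-enter threshold of 4.0, the coefficient value, and the names `forward_select` and `simulate` are all assumptions chosen for illustration. It generates independent standard normal candidates, makes the first few authentic, and reports the average share of selected variables that are noise together with the average adjusted R² of the selected model.

```python
import numpy as np


def forward_select(X, y, f_enter=4.0):
    """Greedy forward selection (an assumed procedure, not the paper's).

    Adds the candidate with the largest F-to-enter until no candidate
    reaches f_enter. Returns the selected column indices and the final RSS.
    """
    n, k = X.shape
    ones = np.ones((n, 1))
    selected = []
    rss = float(np.sum((y - y.mean()) ** 2))  # RSS of intercept-only model
    while len(selected) < min(k, n - 2):
        best_j, best_rss = None, rss
        for j in range(k):
            if j in selected:
                continue
            A = np.hstack([ones, X[:, selected + [j]]])
            resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
            r = float(resid @ resid)
            if r < best_rss:
                best_j, best_rss = j, r
        if best_j is None:
            break
        df_resid = n - (len(selected) + 1) - 1
        f_stat = (rss - best_rss) / (best_rss / df_resid)
        if f_stat < f_enter:
            break
        selected.append(best_j)
        rss = best_rss
    return selected, rss


def simulate(n=50, k=20, n_authentic=3, beta=0.5, n_reps=200, seed=0):
    """Return (mean noise fraction among selected, mean adjusted R^2).

    Columns 0..n_authentic-1 are authentic; the rest are pure noise.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    noise_fracs, adj_r2s = [], []
    for _ in range(n_reps):
        X = rng.standard_normal((n, k))
        y = beta * X[:, :n_authentic].sum(axis=1) + rng.standard_normal(n)
        selected, rss = forward_select(X, y)
        if not selected:
            continue  # nothing entered; skip this sample
        p = len(selected)
        tss = float(np.sum((y - y.mean()) ** 2))
        adj_r2s.append(1 - (rss / (n - p - 1)) / (tss / (n - 1)))
        noise_fracs.append(sum(j >= n_authentic for j in selected) / p)
    return float(np.mean(noise_fracs)), float(np.mean(adj_r2s))


if __name__ == "__main__":
    for k in (10, 20, 40):
        frac, r2 = simulate(n=50, k=k)
        print(f"k={k:2d}  noise fraction={frac:.2f}  mean adjusted R^2={r2:.2f}")
```

Varying `k` relative to `n` in a run like this is one way to explore the abstract's claim that the problem worsens as the number of candidates grows relative to the number of observations. The averaged adjusted R² can be compared with the population value implied by the chosen `beta` and `n_authentic` to gauge inflation.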