A comparison of various methods for multivariate regression with highly collinear variables |
| |
Authors: | Henk A. L. Kiers; Age K. Smilde |
| |
Affiliation: | (1) Heymans Institute, University of Groningen, Groningen, The Netherlands; (2) Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands |
| |
Abstract: | Regression tends to give very unstable and unreliable regression weights when predictors are highly collinear. Several methods have been proposed to counter this problem. A subset of these do so by finding components that summarize the information in the predictors and the criterion variables. The present paper compares six such methods (two of which are almost completely new) to ordinary regression. The six methods are partial least squares (PLS), principal component regression (PCR), principal covariates regression, reduced rank regression, and two variants of what is called power regression. The comparison is mainly done by means of a series of simulation studies, in which data are constructed in various ways, with different degrees of collinearity and noise. The methods are compared in terms of their capability of recovering the population regression weights, as well as their prediction quality for the complete population. It turns out that recovery of regression weights in situations with collinearity is often very poor for all methods, unless the regression weights lie in the subspace spanned by the first few principal components of the predictor variables. In those cases, PLS and PCR typically give the best recoveries of regression weights. The picture is inconclusive, however, because, especially in the study with more realistic simulated data, PLS and PCR gave the poorest recoveries of regression weights in conditions with relatively low noise and collinearity. It seems that PLS and PCR are particularly indicated in cases with much collinearity, whereas in other cases it is better to use ordinary regression. As far as prediction is concerned, it suffers far less from collinearity than recovery of the regression weights does. |
| |
Keywords: | Multivariate regression; PLS; Principal component regression; Principal covariate regression; Power regression; Multicollinearity |
This article is indexed in SpringerLink and other databases. |
|
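The abstract's central observation, that component methods recover regression weights well when those weights lie in the subspace of the first few principal components, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' simulation design: the function names, sample size, number of predictors, and noise levels are all assumptions chosen for illustration, and the "population" weights are built from the sample principal axes for simplicity. It compares ordinary least squares with principal component regression (PCR) on highly collinear predictors, using NumPy.

```python
import numpy as np


def ols_weights(X, y):
    """Ordinary least-squares regression weights."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def pcr_weights(X, y, n_components):
    """Principal component regression: regress y on the first
    n_components principal component scores of X, then map the
    component weights back to the original predictors."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:n_components].T             # principal axes (p x k)
    T = X @ V                           # component scores (n x k)
    # Scores are orthogonal, so T'T is diagonal with entries s**2.
    gamma = (T.T @ y) / s[:n_components] ** 2
    return V @ gamma


def simulate(seed=0, n=50, p=6, noise_sd=1.0, collin_sd=0.05):
    """Hypothetical data: p predictors driven by 2 latent variables,
    so the predictors are highly collinear (all values are assumptions)."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((n, 2))
    loadings = rng.standard_normal((2, p))
    X = latent @ loadings + collin_sd * rng.standard_normal((n, p))
    X -= X.mean(axis=0)
    # True weights placed in the span of the first two principal axes.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    b_true = Vt[:2].T @ np.array([1.0, -0.5])
    y = X @ b_true + noise_sd * rng.standard_normal(n)
    y -= y.mean()
    return X, y, b_true


X, y, b_true = simulate()
err_ols = np.linalg.norm(ols_weights(X, y) - b_true)
err_pcr = np.linalg.norm(pcr_weights(X, y, n_components=2) - b_true)
print(f"weight recovery error, OLS: {err_ols:.3f}")
print(f"weight recovery error, PCR (2 components): {err_pcr:.3f}")
```

In this setup the OLS estimate also picks up noise along the low-variance directions of the predictors, which PCR discards, so PCR's recovery error is the smaller one. This holds only because the true weights were placed in the leading subspace. The abstract notes that recovery is often poor for all methods when that condition fails.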