Similar Articles
20 similar articles found.
1.
Ridge regression addresses multicollinearity by introducing a biasing parameter, called the ridge parameter, which shrinks the estimates and their standard errors in order to obtain acceptable results. The ridge parameter has previously been selected by several subjective and objective techniques, each tied to particular criteria. In this study, selection of the ridge parameter draws on other important statistical measures in order to reach a better value. The proposed selection technique is based on a mathematical programming model, and the results are evaluated in a simulation study. The proposed method performs well when the error variance is greater than or equal to one, the sample consists of 20 observations, the model contains 2 explanatory variables, and the two explanatory variables are very strongly correlated.
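For orientation, the ridge estimator that this and several of the following abstracts build on is, in its standard textbook form (this is not the mathematical programming formulation proposed in the article above, which is not reproduced here):

    \hat{\beta}(k) = (X^\top X + k I_p)^{-1} X^\top y, \qquad k \ge 0,

with k = 0 recovering ordinary least squares; as k grows, the coefficient estimates and their standard errors shrink.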

2.
The use of biased estimation in data analysis and model building is discussed. A review of the theory of ridge regression and its relation to generalized inverse regression is presented along with the results of a simulation experiment and three examples of the use of ridge regression in practice. Comments on variable selection procedures, model validation, and ridge and generalized inverse regression computation procedures are included. The examples studied here show that when the predictor variables are highly correlated, ridge regression produces coefficients which predict and extrapolate better than least squares and is a safe procedure for selecting variables.
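A minimal Python sketch in the spirit of the comparison described above, contrasting OLS and ridge coefficient estimates when two predictors are almost collinear. The sample size, correlation structure, true coefficients, and ridge constant k = 1.0 are illustrative assumptions, not values taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 50, 1.0

    # Two nearly collinear predictors and a known coefficient vector.
    x1 = rng.normal(size=n)
    x2 = x1 + 0.05 * rng.normal(size=n)
    X = np.column_stack([x1, x2])
    y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

    # OLS and ridge estimates (no intercept, since the true model has none).
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    beta_ridge = np.linalg.solve(X.T @ X + k * np.eye(2), X.T @ y)

    print("OLS:  ", beta_ols)    # typically unstable under near-collinearity
    print("Ridge:", beta_ridge)  # shrunken and usually more stable

Averaging over repeated draws of y makes the contrast clearer: the OLS coefficients vary wildly between replications, while the ridge coefficients stay comparatively stable.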

3.
Ridge regression has been widely applied to estimation under collinearity by defining a class of estimators that depend on a parameter k. The variance inflation factor (VIF) is used to detect the presence of collinearity and also as an objective method for choosing the value of k in ridge regression. Contrary to the definition of the VIF, the expressions traditionally applied in ridge regression do not necessarily yield VIF values equal to or greater than 1. This work presents an alternative expression for calculating the VIF in ridge regression that satisfies this condition and also has other interesting properties.
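For concreteness, the expression "traditionally applied" in ridge regression is usually taken to be the diagonal of (R + kI)^{-1} R (R + kI)^{-1}, where R is the correlation matrix of the predictors; the alternative expression proposed in the article is not reproduced here. A small sketch, assuming standardized predictor columns:

    import numpy as np

    def ridge_vif_traditional(X, k):
        # Traditional ridge VIFs: diagonal of (R + kI)^{-1} R (R + kI)^{-1},
        # where R is the correlation matrix of the predictor columns of X.
        R = np.corrcoef(X, rowvar=False)
        A = np.linalg.inv(R + k * np.eye(R.shape[0]))
        return np.diag(A @ R @ A)

For large k these values can drop below 1, which is exactly the inconsistency with the definition of the VIF that the article addresses.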

4.
This study introduces fast marginal maximum likelihood (MML) algorithms for estimating the tuning (shrinkage) parameter(s) of the ridge and power ridge regression models, and an automatic plug-in MML estimator for the generalized ridge regression model, in a Bayesian framework. These methods are applicable to multicollinear or singular covariate design matrices, including matrices where the number of covariates exceeds the sample size. According to analyses of many real and simulated datasets, these MML-based ridge methods tend to compare favorably to other tuning parameter selection methods, in terms of computation speed, prediction accuracy, and ability to detect relevant covariates.

5.

We propose a semiparametric framework based on sliced inverse regression (SIR) to address the issue of variable selection in functional regression. SIR is an effective method for dimension reduction that computes a linear projection of the predictors onto a low-dimensional space, without loss of information on the regression. In order to deal with the high dimensionality of the predictors, we consider penalized versions of SIR: ridge and sparse. We extend the variable selection approaches developed for multidimensional SIR to select intervals that form a partition of the definition domain of the functional predictors. Selecting entire intervals rather than separate evaluation points improves the interpretability of the estimated coefficients in the functional framework. A fully automated iterative procedure is proposed to find the critical (interpretable) intervals. The approach is shown to be efficient on simulated and real data. The method is implemented in the R package SISIR available on CRAN at https://cran.r-project.org/package=SISIR.
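As background for readers unfamiliar with SIR, the following is a minimal sketch of basic sliced inverse regression on multivariate data; the penalized (ridge and sparse) versions and the interval-selection procedure for functional predictors described above, as implemented in SISIR, are not reproduced here.

    import numpy as np

    def sir_directions(X, y, n_slices=10, n_dirs=2):
        # Basic SIR: slice the response, average the standardized predictors
        # within each slice, and take the leading eigenvectors of the weighted
        # covariance of these slice means.
        n, p = X.shape
        L = np.linalg.cholesky(np.cov(X, rowvar=False))
        A = np.linalg.inv(L).T                     # whitening matrix
        Z = (X - X.mean(axis=0)) @ A               # standardized predictors
        slices = np.array_split(np.argsort(y), n_slices)
        M = np.zeros((p, p))
        for idx in slices:
            m = Z[idx].mean(axis=0)                # slice mean
            M += (len(idx) / n) * np.outer(m, m)   # weighted covariance of slice means
        vals, vecs = np.linalg.eigh(M)             # eigenvalues in ascending order
        eta = vecs[:, ::-1][:, :n_dirs]            # leading eigenvectors
        return A @ eta                             # projection directions on the original scale

The high dimensionality of functional predictors is what makes the plain covariance inversion above unstable and motivates the ridge and sparse penalties discussed in the article.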


6.
We consider a generalization of ridge regression and demonstrate its advantages over ridge regression. We provide an empirical Bayes method for determining the ridge constants, using the Bayesian interpretation of ridge estimators, and show that this coincides with a method based on a generalization of the Cp statistic and the non-negative garrote. These provide an automatic variable selection procedure for the canonical variables.

7.
The existence of values of the ridge parameter for which ridge regression is preferable to OLS by the Pitman nearness criterion, under both the quadratic loss and Fisher's loss, is shown. Preference regions of the two estimators under these loss functions are found. An upper bound for the value of Pitman's measure of closeness, independent of whether the ridge parameter is chosen deterministically or stochastically, is given.

8.
Ridge regression is an alternative to ordinary least squares that is mostly applied when a multiple linear regression model presents a worrying degree of collinearity. A central topic in ridge regression is the selection of the ridge parameter, and different proposals have been presented in the scientific literature. Since the ridge estimator is biased, the parameter is normally chosen by minimizing the mean square error (MSE), without considering (to the best of our knowledge) whether the proposed value of the ridge parameter really mitigates the collinearity. With this goal, and supported by different simulations, this paper proposes estimating the ridge parameter from the determinant of the correlation matrix of the data, checking that the resulting variance inflation factor (VIF) falls below the traditionally established threshold. The possible relation between the VIF and the determinant of the correlation matrix is also analysed. Finally, the contribution is illustrated with three real examples.
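A rough sketch of the kind of check described above: increase k until every VIF falls below a conventional threshold (10 is a common rule of thumb). The determinant-based formula actually proposed in the article is not reproduced here, so the stopping rule below is purely illustrative.

    import numpy as np

    def ridge_vifs(R, k):
        # VIFs for ridge constant k, with R the correlation matrix of the predictors.
        A = np.linalg.inv(R + k * np.eye(R.shape[0]))
        return np.diag(A @ R @ A)

    def smallest_k_with_acceptable_vifs(R, threshold=10.0, step=0.01, k_max=5.0):
        # Illustrative grid search: return the first k whose VIFs all fall below
        # the threshold (not the determinant-based estimator of the article).
        k = 0.0
        while k <= k_max:
            if np.max(ridge_vifs(R, k)) < threshold:
                return k
            k += step
        return k_max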

9.
The generalized cross-validation (GCV) method has been a popular technique for selecting tuning parameters for smoothing and penalization, and has become a standard tool for selecting tuning parameters of shrinkage models in recent work. Its computational ease and robustness relative to cross-validation also make it competitive for model selection. It is well known that the GCV method performs well for linear estimators, which are linear functions of the response variable, such as the ridge estimator. However, it may not perform well for nonlinear estimators, since the GCV emphasizes linear characteristics by taking the trace of the projection matrix. This paper aims to explore the GCV for nonlinear estimators and to further extend the results to correlated data in longitudinal studies. We expect that the nonlinear GCV and quasi-GCV developed in this paper will provide similar tools for the selection of tuning parameters in linear penalty models and penalized GEE models.
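For reference, the linear-estimator case mentioned above has a familiar closed form. Writing the ridge fitted values as H(k) y, the standard GCV criterion (not the nonlinear or quasi-GCV extensions developed in the paper) is

    \mathrm{GCV}(k) = \frac{\tfrac{1}{n}\,\lVert y - H(k)\,y \rVert^2}{\bigl(1 - \operatorname{tr}(H(k))/n\bigr)^2},
    \qquad H(k) = X\,(X^\top X + k I)^{-1} X^\top,

and k is chosen to minimize this quantity; the trace term is what the abstract refers to as emphasizing linear characteristics.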

10.
When variable selection with stepwise regression and model fitting are conducted on the same data set, competition for inclusion in the model induces a selection bias in coefficient estimators away from zero. In proportional hazards regression with right-censored data, selection bias inflates the absolute values of the parameter estimates of selected variables, while the omission of other variables may shrink coefficients toward zero. This paper explores the extent of the bias in parameter estimates from stepwise proportional hazards regression and proposes a bootstrap method, similar to those proposed by Miller (Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC, 2002) for linear regression, to correct for selection bias. We also use bootstrap methods to estimate the standard errors of the adjusted estimators. Simulation results show that substantial biases can be present in uncorrected stepwise estimators and, for binary covariates, can exceed 250% of the true parameter value. The simulations also show that the conditional mean of the proposed bootstrap bias-corrected parameter estimator, given that a variable is selected, is moved closer to the unconditional mean of the standard partial likelihood estimator in the chosen model, and to the population value of the parameter. We also explore the effect of the adjustment on estimates of log relative risk, given the values of the covariates in a selected model. The proposed method is illustrated with data sets on primary biliary cirrhosis and multiple myeloma from the Eastern Cooperative Oncology Group.
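A generic sketch of the Miller-style bootstrap bias correction described above, written for an arbitrary post-selection estimator. The routine fit_stepwise is a hypothetical placeholder for a stepwise fitting procedure returning a full-length coefficient vector with zeros for unselected variables; the paper itself works with stepwise proportional hazards regression on right-censored data, which is not implemented here.

    import numpy as np

    def bootstrap_bias_corrected(X, y, fit_stepwise, n_boot=200, seed=0):
        # Estimate the selection bias of a stepwise estimator by resampling cases,
        # then subtract it from the original estimate (Miller-style correction).
        rng = np.random.default_rng(seed)
        n = len(y)
        beta_hat = fit_stepwise(X, y)              # estimate on the original data
        boot = np.empty((n_boot, X.shape[1]))
        for b in range(n_boot):
            idx = rng.integers(0, n, size=n)       # resample observations with replacement
            boot[b] = fit_stepwise(X[idx], y[idx])
        bias = boot.mean(axis=0) - beta_hat        # estimated selection bias
        return beta_hat - bias, boot.std(axis=0)   # corrected estimate and bootstrap SEs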

11.

In this article, we discuss the estimation of regression coefficients in a multiple regression model when it is suspected that the coefficients may be restricted to a subspace. In order to improve on the preliminary test almost ridge estimator, we study the positive-rule Stein-type almost unbiased ridge estimator, built from the positive-rule Stein-type shrinkage estimator and the almost unbiased ridge estimator. Quadratic bias and quadratic risk values of the new estimator are then derived and compared with those of related estimators. We also discuss the choice of the parameter k. Finally, we present a real data example and a Monte Carlo study to illustrate the theoretical results.

12.
In this article, we assess the local influence for ridge regression of linear models with stochastic linear restrictions, in the spirit of Cook, using the log-likelihood of the stochastic restricted ridge regression estimator. Diagnostics under perturbations of the constant variance, the responses, and individual explanatory variables are derived. We also assess the local influence of the stochastic restricted ridge regression estimator under the approach suggested by Billor and Loynes. Finally, a numerical example on the Longley data is given to illustrate the theoretical results.

13.
Variable selection is an important task in regression analysis. The performance of a statistical model depends strongly on the choice of the subset of predictors. There are several methods for selecting the most relevant variables to construct a good model. In practice, however, the dependent variable may take positive continuous values and may not be normally distributed. In such situations, the gamma distribution is more suitable than the normal for building a regression model. This paper introduces a heuristic approach that performs variable selection using artificial bee colony optimization for gamma regression models. We evaluated the proposed method against classical selection methods such as backward and stepwise selection. Both simulation studies and real data examples demonstrate the accuracy of our selection procedure.

14.
When the component proportions in mixture experiments are restricted by lower and upper bounds, multicollinearity appears all too frequently. We therefore suggest the use of ridge regression as a means of stabilizing the coefficient estimates in the fitted model. We propose graphical methods for evaluating the effect of the ridge regression estimator with respect to the predicted response value and the prediction variance.

15.
The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences, and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real-world data. The results show that optimizing a utility estimate such as the cross-validation (CV) score is liable to find overfitted models, owing to the relatively high variance of the utility estimates when data are scarce. This can also lead to substantial selection-induced bias and optimism in the performance evaluation of the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty through the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information in the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method also appears to outperform the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that model selection can greatly benefit from using cross-validation outside the search process, both for guiding the model size selection and for assessing the predictive performance of the finally selected model.

16.
Sliced regression is an effective dimension reduction method that replaces the original high-dimensional predictors with an appropriate low-dimensional projection. It is free from any probabilistic assumption and can exhaustively estimate the central subspace. In this article, we propose incorporating shrinkage estimation into sliced regression so that variable selection can be achieved simultaneously with dimension reduction. The new method improves estimation accuracy and achieves better interpretability for the reduced variables. The efficacy of the proposed method is shown through both simulation and real data analysis.

17.
This article puts forward the linearized restricted ridge regression (LRRR) estimator for linear regression models. Two types of LRRR estimators are investigated under the PRESS criterion, and the optimal LRRR estimators and the optimal restricted generalized ridge regression estimator are obtained. We apply the results to the Hald data and, finally, carry out a simulation study using the method of McDonald and Galarneau.

18.
A shrinkage ridge estimator is proposed for the linear regression model with elliptical errors. Specifically, the restricted ridge regression estimator under a subspace restriction is improved by incorporating a general function that admits a Taylor series expansion. The approximate quadratic risk function of the proposed shrinkage ridge estimator is evaluated in the elliptical regression model. A Monte Carlo simulation study and an analysis of a real data example are used to assess performance. The numerical results show that, in some specific cases, the shrinkage ridge estimator performs better than both the unrestricted and restricted estimators in the multivariate t-regression model.

19.
In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that in the presence of multicollinearity the performance of the least squares (LS) estimator of the regression parameters is not satisfactory. Consequently, subset selection methods such as Mallows' Cp, which are based on LS estimates, lead to the selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
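For reference, the Mallows' Cp criterion mentioned above is, for a candidate subset with p coefficients,

    C_p = \frac{\mathrm{RSS}_p}{\hat{\sigma}^2} - n + 2p,

where RSS_p is the residual sum of squares of the subset model and \hat{\sigma}^2 is usually estimated from the full model by least squares. Under multicollinearity both quantities come from unstable LS fits, which is the motivation for replacing them with ridge-based counterparts; the specific ridge-based algorithm of the article is not reproduced here.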

20.
In this paper, we propose a novel Max-Relevance and Min-Common-Redundancy criterion for variable selection in linear models. Considering that the ensemble approach to variable selection has proven quite effective in linear regression models, we construct a variable selection ensemble (VSE) by combining the presented stochastic correlation coefficient algorithm with a stochastic stepwise algorithm. We conduct an extensive experimental comparison of our algorithm and other methods using two simulation studies and four real-life data sets. The results confirm that the proposed VSE leads to promising improvements in variable selection and regression accuracy.
