Similar literature
Found 20 similar documents (search time: 437 ms)
1.
Constrained estimators that enforce variable selection and grouping of highly correlated data have been shown to be successful in finding sparse representations and obtaining good performance in prediction. We consider polytopes as a general class of compact and convex constraint regions. Well-established procedures like LASSO (Tibshirani, 1996) or OSCAR (Bondell and Reich, 2008) are shown to be based on specific subclasses of polytopes. The general framework of polytopes can be used to investigate the geometric structure that underlies these procedures. Moreover, we propose a specifically designed class of polytopes that enforces variable selection and grouping. Simulation studies and an application illustrate the usefulness of the proposed method.

2.
In this article we present a robust and efficient variable selection procedure by using modal regression for varying-coefficient models with longitudinal data. The new method is proposed based on basis function approximations and a group version of the adaptive LASSO penalty, which can select significant variables and estimate the non-zero smooth coefficient functions simultaneously. Under suitable conditions, we establish the consistency in variable selection and the oracle property in estimation. A simulation study and two real data examples are undertaken to assess the finite sample performance of the proposed variable selection procedure.

3.
The lasso has proved to be an extremely successful technique for simultaneous estimation and variable selection. However, the lasso has two major drawbacks. First, it does not enforce any grouping effect; second, in some situations lasso solutions are inconsistent for variable selection. To overcome this inconsistency, the adaptive lasso was proposed, in which adaptive weights are used for penalizing different coefficients. Recently, a doubly regularized technique, the elastic net, was proposed; it encourages a grouping effect, i.e., either selection or omission of correlated variables together. However, the elastic net is also inconsistent. In this paper we study the adaptive elastic net, which does not have this drawback. We focus especially on the grouped selection property of the adaptive elastic net along with its model selection complexity. We also shed some light on the bias-variance tradeoff of different regularization methods, including the adaptive elastic net. An efficient algorithm along the lines of LARS-EN is proposed and then illustrated with simulated as well as real-life data examples.
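As a rough illustration of the adaptive weights mentioned in this abstract, the sketch below compares plain and adaptive lasso estimates in the special case of an orthonormal design (X'X = I) with criterion (1/2)||y - Xb||^2 + lam * sum_j w_j |b_j|, where both estimators reduce to soft-thresholding of the least-squares coefficients. The function names, the weight exponent `gamma`, and the example numbers are assumptions made for illustration; this is not the LARS-EN-style algorithm the abstract refers to.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_orthonormal(beta_ols, lam):
    """Lasso solution when X'X = I: every coefficient gets the same threshold."""
    return [soft_threshold(b, lam) for b in beta_ols]


def adaptive_lasso_orthonormal(beta_ols, lam, gamma=1.0):
    """Adaptive lasso when X'X = I, with weights w_j = 1 / |beta_ols_j|**gamma."""
    result = []
    for b in beta_ols:
        if b == 0.0:
            # An exactly zero initial estimate has infinite weight: keep it at zero.
            result.append(0.0)
            continue
        weight = 1.0 / abs(b) ** gamma
        result.append(soft_threshold(b, lam * weight))
    return result


beta_ols = [3.0, 0.6]  # hypothetical least-squares estimates
print(lasso_orthonormal(beta_ols, 0.5))
print(adaptive_lasso_orthonormal(beta_ols, 0.5))
```

In this toy case the large coefficient is shrunk less than under the plain lasso, while the small coefficient receives a larger threshold and is set to zero.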

4.
In this paper, we propose a new full iteration estimation method for quantile regression (QR) of the single-index model (SIM). The asymptotic properties of the proposed estimator are derived. Furthermore, we propose a variable selection procedure for the QR of SIM by combining the estimation method with the adaptive LASSO penalized method to get sparse estimation of the index parameter. The oracle properties of the variable selection method are established. Simulations with various non-normal errors are conducted to demonstrate the finite sample performance of the estimation method and the variable selection procedure. Furthermore, we illustrate the proposed method by analyzing a real data set.

5.
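In recent years, multidimensional psychological tests have been widely used in many kinds of assessment. Although the latent traits (also called dimensions) measured by the test as a whole are known when the test is constructed, the specific dimensions measured by each item still need to be determined. By exploiting the relationship between multidimensional item response theory models and generalized linear models, two variable selection methods, the LASSO and the elastic net, can be used to solve the problem of identifying the dimensions of test items. A simulation study found that the LASSO identifies dimensions better than the elastic net and achieves high identification accuracy for different types of multidimensional tests.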

6.
Regularization and variable selection via the elastic net (total citations: 2; self-citations: 0; citations by others: 2)
Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the p ≫ n case. An algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
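The grouping effect described in this abstract can be seen in a small numerical sketch. The code below minimizes an elastic-net-type criterion, (1/(2n))||y - Xb||^2 + lam1 * ||b||_1 + (lam2/2) * ||b||^2, by plain cyclic coordinate descent. Coordinate descent is swapped in here for simplicity; it is not the LARS-EN algorithm named in the abstract, and the scaling of the criterion, the function names, and the toy data are all assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam1, lam2, sweeps=500):
    """Cyclic coordinate descent for the criterion stated above (illustrative only)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # (1/n) * x_j' (partial residual that leaves out predictor j)
            z = 0.0    # (1/n) * x_j' x_j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam1) / (z + lam2)
    return beta


# Toy data with two identical predictors (hypothetical numbers).
X = [[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -2.0]]
y = [2.0, -2.0, 4.0, -4.0]
print(elastic_net_cd(X, y, lam1=0.1, lam2=0.5))  # ridge term active
print(elastic_net_cd(X, y, lam1=0.1, lam2=0.0))  # lasso penalty only
```

With `lam2 > 0` the two identical predictors end up with equal coefficients, whereas with `lam2 = 0` this particular run puts essentially all of the weight on the first predictor (the lasso solution is not unique in that case).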

7.
In this article, a new robust variable selection approach is introduced by combining the robust generalized estimating equations and adaptive LASSO penalty function for longitudinal generalized linear models. Then, an efficient weighted Gaussian pseudo-likelihood version of the BIC (WGBIC) is proposed to choose the tuning parameter in the process of robust variable selection and to select the best working correlation structure simultaneously. Meanwhile, the oracle properties of the proposed robust variable selection method are established and an efficient algorithm combining the iterative weighted least squares and minorization–maximization is proposed to implement robust variable selection and parameter estimation.

8.
We consider a linear regression model where there are group structures in covariates. The group LASSO has been proposed for group variable selection. Many nonconvex penalties, such as the smoothly clipped absolute deviation and the minimax concave penalty, have been extended to group variable selection problems. The group coordinate descent (GCD) algorithm is widely used for fitting these models. However, GCD algorithms are hard to apply to nonconvex group penalties because of their computational complexity unless the design matrix is orthogonal. In this paper, we propose an efficient optimization algorithm for nonconvex group penalties by combining the concave convex procedure and the group LASSO algorithm. We also extend the proposed algorithm to generalized linear models. We evaluate the numerical efficiency of the proposed algorithm compared to existing GCD algorithms through simulated data and real data sets.
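For orientation, the building block of group coordinate descent for the convex group LASSO penalty is a group-wise soft-thresholding step. The sketch below assumes an orthonormalized design within the group and ignores group-size weights; the function name and numbers are hypothetical, and it does not implement the concave convex procedure proposed in the abstract.

```python
import math


def group_soft_threshold(z, lam):
    """Scale the whole vector z toward zero: (1 - lam / ||z||)_+ * z."""
    norm = math.sqrt(sum(v * v for v in z))
    if norm <= lam:
        # The entire group is removed from the model.
        return [0.0] * len(z)
    scale = 1.0 - lam / norm
    return [scale * v for v in z]


print(group_soft_threshold([3.0, 4.0], 1.0))  # group kept, shrunk as a whole
print(group_soft_threshold([3.0, 4.0], 6.0))  # group dropped entirely
```

Because the threshold acts on the norm of the whole group, the coefficients in a group enter or leave the model together.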

9.
A new regularization method for regression models is proposed. The criterion to be minimized contains a penalty term which explicitly links strength of penalization to the correlation between predictors. Like the elastic net, the method encourages a grouping effect where strongly correlated predictors tend to be in or out of the model together. A boosted version of the penalized estimator, which is based on a new boosting method, allows variables to be selected. Real world data and simulations show that the method compares well to competing regularization techniques. In settings where the number of predictors is smaller than the number of observations it frequently performs better than competitors; in high-dimensional settings, prediction measures favor the elastic net, while accuracy of estimation and stability of variable selection favor the newly proposed method.

10.
Credit scoring can be defined as the set of statistical models and techniques that help financial institutions in their credit decision making. In this paper, we consider a coarse classification method based on fused least absolute shrinkage and selection operator (LASSO) penalization. By adopting fused LASSO, one can deal with continuous as well as discrete variables in a unified framework. For computational efficiency, we develop a penalization path algorithm. Through numerical examples, we compare the performances of fused LASSO and LASSO with dummy variable coding.

11.
This paper concerns model selection for autoregressive time series when the observations are contaminated with trend. We propose an adaptive least absolute shrinkage and selection operator (LASSO) type model selection method, in which the trend is estimated by B-splines, the detrended residuals are calculated, and then the residuals are used as if they were observations to optimize an adaptive LASSO type objective function. The oracle properties of such an adaptive LASSO model selection procedure are established; that is, the proposed method can identify the true model with probability approaching one as the sample size increases, and the asymptotic properties of estimators are not affected by the replacement of observations with detrended residuals. The intensive simulation studies of several constrained and unconstrained autoregressive models also confirm the theoretical results. The method is illustrated by two time series data sets, the annual U.S. tobacco production and annual tree ring width measurements.

12.
This paper studies the outlier detection and robust variable selection problem in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to simultaneously achieve outlier detection and robust variable selection. An iterative algorithm is proposed to solve the resulting optimization problem. Monte Carlo studies are conducted to evaluate the finite-sample performance of the proposed methods. The results indicate that the finite-sample performance of the proposed methods is better than that of the existing methods when there are leverage points or outliers in the response variable or explanatory variables. Finally, we apply the proposed methodology to analyze two real datasets.

13.
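Under the generalized linear model assumption, Lin's medical cost model is adopted, and the LASSO and SCAD methods are used to select the factors that affect medical costs. The effectiveness of the two methods is compared, which identifies the important factors affecting medical insurance payouts and resolves a series of problems caused by high-dimensional variables. In the empirical analysis, because the two methods emphasize different statistical properties, the selected explanatory variables differ slightly; nevertheless, the analysis shows that both results are highly interpretable and reflect important information affecting medical insurance payouts.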

14.
Variable selection methods have been widely used in the analysis of high-dimensional data, for example, gene expression microarray data and single nucleotide polymorphism data. A special feature of genomic data is that genes participating in a common metabolic pathway or sharing a similar biological function tend to have high correlations. The collinearity naturally embedded in these data requires special handling, which cannot be provided by existing variable selection methods. In this paper, we propose a set of new methods to select variables in correlated data. The new methods follow the forward selection procedure of least angle regression (LARS) but conduct grouping and selecting at the same time. In particular, the methods work when no prior information on the group structure of the data is available. Simulations and real examples show that our proposed methods often outperform the existing variable selection methods, including LARS and elastic net, in terms of both reducing prediction error and preserving sparsity of representation.

15.
16.
Abstract. The Dantzig selector (DS) is a recent approach to estimation in high-dimensional linear regression models with a large number of explanatory variables and a relatively small number of observations. As in the least absolute shrinkage and selection operator (LASSO), this approach sets certain regression coefficients exactly to zero, thus performing variable selection. However, such a framework, contrary to the LASSO, has never been used in regression models for survival data with censoring. A key motivation of this article is to study the estimation problem for Cox's proportional hazards (PH) function regression models using a framework that extends the theory, the computational advantages and the optimal asymptotic rate properties of the DS to the class of Cox's PH under appropriate sparsity scenarios. We perform a detailed simulation study to compare our approach with other methods and illustrate it on a well-known microarray gene expression data set for predicting survival from gene expressions.

17.
Many semi-parametric and nonparametric models are used to fit nonlinear time series data. They include partially linear time series models, nonparametric additive models, and semi-parametric single index models. In this article, we focus on fitting time series data with a partially linear additive model. Combining the orthogonal series approximation and the adaptive sparse group LASSO regularization, we select the important variables between and within the groups simultaneously. Specifically, we propose a two-step algorithm to obtain the grouped sparse estimators. Numerical studies show that the proposed method outperforms the LASSO method in both fitting and forecasting. An empirical analysis is used to illustrate the methodology.

18.
The accelerated failure time (AFT) models have proved useful in many contexts, though heavy censoring (as for example in cancer survival) and high dimensionality (as for example in microarray data) cause difficulties for model fitting and model selection. We propose new approaches to variable selection for censored data, based on AFT models optimized using regularized weighted least squares. The regularized technique uses a mixture of \(\ell _1\) and \(\ell _2\) norm penalties under two proposed elastic net type approaches. One is the adaptive elastic net and the other is the weighted elastic net. The approaches extend the original approaches proposed by Ghosh (Adaptive elastic net: an improvement of elastic net to achieve oracle properties, Technical Reports 2007) and Hong and Zhang (Math Model Nat Phenom 5(3):115–133 2010), respectively. We also extend the two proposed approaches by adding censoring observations as constraints into their model optimization frameworks. The approaches are evaluated on microarray data and by simulation. We compare the performance of these approaches with six other variable selection techniques: three are generally used for censored data and the other three are correlation-based greedy methods used for high-dimensional data.

19.
The least absolute shrinkage and selection operator (LASSO) is a prominent estimator which selects significant (in some sense) features and kills insignificant ones. Indeed, the LASSO shrinks features larger than a noise level toward zero. In this article, we force the LASSO to be shrunk further by proposing a Stein-type shrinkage estimator emanating from the LASSO, namely the Stein-type LASSO. The newly proposed estimator exhibits good numerical performance in terms of risk. Variants of this estimator have smaller relative MSE and prediction error, compared to the LASSO, in the analysis of a prostate cancer dataset.

20.
ESTIMATION, PREDICTION AND INFERENCE FOR THE LASSO RANDOM EFFECTS MODEL (total citations: 1; self-citations: 0; citations by others: 1)
The least absolute shrinkage and selection operator (LASSO) can be formulated as a random effects model with an associated variance parameter that can be estimated with other components of variance. In this paper, estimation of the variance parameters is performed by means of an approximation to the marginal likelihood of the observed outcomes. The approximation is based on an alternative but equivalent formulation of the LASSO random effects model. Predictions can be made using point summaries of the predictive distribution of the random effects given the data with the parameters set to their estimated values. The standard LASSO method uses the mode of this distribution as the predictor. It is not the only choice, and a number of other possibilities are defined and empirically assessed in this article. The predictive mode is competitive with the predictive mean (best predictor), but no single predictor performs best across all situations. Inference for the LASSO random effects is performed using predictive probability statements, which are more appropriate under the random effects formulation than tests of hypothesis.
