Similar Articles
20 similar articles found (search time: 31 ms)
1.
We propose a shrinkage procedure for simultaneous variable selection and estimation in generalized linear models (GLMs) with an explicit predictive motivation. The procedure estimates the coefficients by minimizing the Kullback-Leibler divergence of a set of predictive distributions from the corresponding predictive distributions for the full model, subject to an ℓ1 constraint on the coefficient vector. This results in the selection of a parsimonious model with predictive performance similar to that of the full model. Because it takes a form similar to the original Lasso problem for GLMs, our procedure can benefit from available ℓ1-regularization path algorithms. Simulation studies and real data examples confirm the efficiency of our method in terms of predictive performance on future observations.
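As a sketch of the formulation (notation mine, and hedged up to the direction of the divergence): with the full-model estimate β̂_full, the predictive criterion solves, for a bound t ≥ 0,
\[
\hat{\beta}(t) \;=\; \arg\min_{\beta}\; \sum_{i=1}^{n} \mathrm{KL}\!\left( p\big(y \mid x_i, \hat{\beta}_{\mathrm{full}}\big) \,\big\|\, p\big(y \mid x_i, \beta\big) \right) \quad \text{subject to} \quad \lVert \beta \rVert_1 \le t .
\]
In Lagrangian form this resembles the usual ℓ1-penalized GLM objective, which is why existing regularization-path algorithms can be reused.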

2.
The adaptive least absolute shrinkage and selection operator (Lasso) and least absolute deviation (LAD)-Lasso are two attractive shrinkage methods for simultaneous variable selection and regression parameter estimation. While the adaptive Lasso is efficient for small-magnitude errors, LAD-Lasso is robust against heavy-tailed errors and severe outliers. In this article, we consider a data-driven convex combination of these two modern procedures to produce a robust adaptive Lasso, which not only enjoys the oracle properties but also synthesizes the advantages of the adaptive Lasso and LAD-Lasso. It fully adapts to different error structures, including the infinite-variance case, and automatically chooses the optimal weight to achieve both robustness and high efficiency. Extensive simulation studies demonstrate good finite-sample performance of the robust adaptive Lasso. Two data sets are analyzed to illustrate the practical use of the procedure.
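One hedged reading of the combined criterion (the paper's exact weighting may differ): with adaptive weights ŵ_j and a data-driven mixing weight γ ∈ [0, 1],
\[
\hat{\beta} \;=\; \arg\min_{\beta}\; \sum_{i=1}^{n} \Big[ \gamma \,\big(y_i - x_i^{\top}\beta\big)^2 + (1-\gamma)\,\big\lvert y_i - x_i^{\top}\beta \big\rvert \Big] \;+\; \lambda \sum_{j=1}^{p} \hat{w}_j \lvert \beta_j \rvert ,
\]
so γ near 1 behaves like the adaptive Lasso and γ near 0 like the LAD-Lasso.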

3.
With quantile regression methods successfully applied in various fields, we often need to handle big datasets with thousands of variables and millions of observations. In this article, we focus on the variable selection aspect of penalized quantile regression and propose a new method, Sampling Lasso Quantile Regression (SLQR), which selects a small but informative subset of the data for fitting quantile regression models. Unlike ordinary regularization methods, SLQR applies a sampling technique to reduce the number of observations before applying the Lasso. Through numerical simulation studies and a real application to the Greenhouse Gas Observing Network, we illustrate the efficacy of the SLQR method. The numerical results show that SLQR achieves high-precision quantile regression on large-scale data for both prediction and interpretation.
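A minimal sketch of the sample-then-penalize idea using scikit-learn's L1-penalized QuantileRegressor; the uniform subsample and fixed alpha below are placeholders for the paper's informative sampling scheme and tuning, not the authors' SLQR algorithm:

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(0)
n, p = 100_000, 50                       # large-scale synthetic data
X = rng.normal(size=(n, p))
y = X[:, 0] - 2.0 * X[:, 1] + rng.standard_t(df=3, size=n)

# Step 1: reduce the number of observations before applying Lasso.
# (Uniform sampling here; SLQR's sampling is designed to be informative.)
idx = rng.choice(n, size=2_000, replace=False)

# Step 2: L1-penalized median regression on the subsample.
fit = QuantileRegressor(quantile=0.5, alpha=0.1, solver="highs")
fit.fit(X[idx], y[idx])
print("selected variables:", np.flatnonzero(np.abs(fit.coef_) > 1e-8))
```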

4.
Identification of influential genes and clinical covariates on patient survival is crucial because it can lead to a better understanding of the underlying mechanisms of disease and to better prediction models. Most variable selection methods in penalized Cox models cannot deal properly with categorical variables such as gender and family history. The group lasso penalty can combine clinical and genomic covariates effectively. In this article, we introduce an optimization algorithm for Cox regression with the group lasso penalty. We compare our method with other methods on simulated and real microarray data sets.
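A schematic of the criterion in the standard group-lasso-penalized Cox form (the paper's algorithmic details are its own): with log partial likelihood ℓ(β) and groups g = 1, …, G of sizes p_g,
\[
\hat{\beta} \;=\; \arg\min_{\beta}\; -\ell(\beta) \;+\; \lambda \sum_{g=1}^{G} \sqrt{p_g}\, \lVert \beta_g \rVert_2 ,
\]
which zeroes out whole groups at once, e.g., all dummy variables coding a single categorical covariate.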

5.
This article compares the mean squared error (or ℓ2 risk) of ordinary least squares (OLS), James–Stein, and least absolute shrinkage and selection operator (Lasso) shrinkage estimators in linear regression where the number of regressors is smaller than the sample size. We compare and contrast the known risk bounds for these estimators, which show that neither James–Stein nor Lasso uniformly dominates the other. We investigate the finite-sample risk using a simple simulation experiment. We find that the risk of Lasso estimation is particularly sensitive to the coefficient parameterization, and for a significant portion of the parameter space Lasso has higher mean squared error than OLS. This investigation suggests that there are potential pitfalls in Lasso estimation and that simulation studies need to explore the parameter space more carefully.
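A toy version of such a risk experiment (my construction, not the authors' design): sweep the common coefficient magnitude and compare Monte Carlo estimates of the ℓ2 coefficient risk for OLS and a Lasso with a fixed penalty.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(1)
n, p, reps = 100, 10, 500

def mc_risk(scale):
    """Monte Carlo E||b_hat - beta||^2 for OLS and Lasso at one parameterization."""
    beta = np.full(p, scale)
    risk = {"ols": 0.0, "lasso": 0.0}
    for _ in range(reps):
        X = rng.normal(size=(n, p))
        y = X @ beta + rng.normal(size=n)
        for name, est in (("ols", LinearRegression()),
                          ("lasso", Lasso(alpha=0.1))):
            b = est.fit(X, y).coef_
            risk[name] += np.sum((b - beta) ** 2) / reps
    return risk

for scale in (0.0, 0.05, 0.2, 1.0):   # small coefficients favor Lasso, large favor OLS
    print(scale, mc_risk(scale))
```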

6.
Restricted canonical correlation analysis and the lasso shrinkage method were paired for canonical correlation analysis with non-negativity restrictions on datasets where the sample size is much smaller than the number of variables. The method was implemented in an alternating least-squares algorithm and applied to cross-language information retrieval on a dataset with aligned documents in eight languages. A set of experiments was run to evaluate the method and compare it with other methods in the field.

7.
The Hilbert–Huang transform uses the empirical mode decomposition (EMD) method to analyze nonlinear and nonstationary data. The method breaks a time series into several orthogonal sequences based on differences in frequency: the intrinsic mode functions (IMFs) and a final residue. Although IMFs have been used as predictors for other variables, little effort has been devoted to identifying the most effective predictors among them. As the Lasso is a widely used method for feature selection in complex datasets, the main objective of this article is to present a Lasso regression based on the EMD method for choosing the decomposed components that exhibit the strongest effects. Both numerical experiments and empirical results show that the proposed modeling process can use the time-frequency structure within the data to reveal interactions between two variables, allowing more accurate predictions of future events.
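A sketch of the decompose-then-select pipeline, assuming the third-party PyEMD package (pip install EMD-signal); the synthetic series and CV-chosen penalty are illustrative only:

```python
import numpy as np
from PyEMD import EMD                    # third-party package "EMD-signal"
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 500)
x = np.sin(40 * t) + 0.5 * np.sin(6 * t) + 0.1 * rng.normal(size=t.size)
y = 2.0 * np.sin(6 * t) + 0.1 * rng.normal(size=t.size)  # driven by the slow component

imfs = EMD().emd(x, t)                   # rows: extracted components, high to low frequency
fit = LassoCV(cv=5).fit(imfs.T, y)       # Lasso picks the effective IMF predictors
print("kept components:", np.flatnonzero(fit.coef_ != 0))
```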

8.
The least squares fit in linear regression is always unique even when the design matrix is rank deficient. In this paper, we extend this classic result to the linearly constrained generalized lasso. It is shown that, under a mild condition, the fit can be represented as a projection onto a polytope and is therefore unique regardless of whether the design matrix X has full column rank. Furthermore, a formula for the degrees of freedom is derived to characterize the effective number of parameters. It directly yields an unbiased estimate of the degrees of freedom, which can be incorporated into an information criterion for model selection.
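For concreteness, the linearly constrained generalized lasso referred to here can be written in common notation (the paper's exact constraint set may differ) as
\[
\min_{\beta}\; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2 \;+\; \lambda\,\lVert D\beta \rVert_1 \quad \text{subject to} \quad C\beta \le b ,
\]
with a penalty matrix D; the result says the fit X β̂ is unique even when β̂ itself is not.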

9.
The lasso procedure is an estimator-shrinkage and variable selection method. This paper shows that there always exists an interval of tuning parameter values over which the mean squared prediction error of the lasso estimator is smaller than that of the ordinary least squares estimator. For an estimator satisfying some condition, such as unbiasedness, the paper defines a corresponding generalized lasso estimator whose mean squared prediction error is shown to be smaller than that of the original estimator for tuning parameter values in some interval. This implies that no unbiased estimator is admissible. Simulation results for five models support the theoretical results.

10.
Variable selection is a fundamental challenge in statistical learning when working with data sets containing a huge number of predictors. In this article we consider two procedures popular in model selection: the Lasso and the adaptive Lasso. Our goal is to investigate the properties of estimators based on minimizing Lasso-type penalized empirical risk with a convex, possibly nondifferentiable, loss function. We obtain theorems concerning the rate of convergence in estimation, consistency in model selection, and oracle properties for Lasso estimators when the number of predictors is fixed, i.e., it does not depend on the sample size. Moreover, we study the properties of Lasso and adaptive Lasso estimators on simulated and real data sets.
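In symbols, the estimators studied are of the form (standard notation, not quoted from the paper)
\[
\hat{\beta} \;=\; \arg\min_{\beta}\; \frac{1}{n} \sum_{i=1}^{n} \phi\big(y_i,\, x_i^{\top}\beta\big) \;+\; \lambda_n \sum_{j=1}^{p} \hat{w}_j\,\lvert \beta_j \rvert ,
\]
where φ is convex in its second argument but possibly nondifferentiable (e.g., the absolute or check loss); ŵ_j ≡ 1 gives the plain Lasso, while data-driven ŵ_j gives the adaptive Lasso.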

11.
The problem of detecting multiple undocumented change points in a historical temperature sequence with a simple linear trend is formulated as a linear model. We apply the adaptive least absolute shrinkage and selection operator (Lasso) to estimate the number and locations of change points, with model selection criteria used to choose the Lasso smoothing parameter. As the adaptive Lasso may overestimate the number of change points, we perform post-selection on the change points detected by the adaptive Lasso using multivariate t simultaneous confidence intervals. Our method is demonstrated on annual temperature data (1902–2000) from Tuscaloosa, Alabama.
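One way to write the working model (my schematic; the paper's parameterization may differ): for change points τ_1 < ⋯ < τ_K with jump sizes δ_k,
\[
y_t \;=\; \mu + \beta t + \sum_{k=1}^{K} \delta_k\, \mathbf{1}\{ t > \tau_k \} + \varepsilon_t , \qquad t = 1, \dots, T ,
\]
so that, after coding an indicator column for every candidate location, detecting change points becomes selecting the few nonzero δ_k, a sparse regression the adaptive Lasso can handle.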

12.
Penalization has been extensively adopted for variable selection in regression. In some applications, covariates have natural grouping structures, where those in the same group have correlated measurements or related functions. In such settings, variable selection should be conducted at both the group level and the within-group level, that is, bi-level selection. In this study, we propose the adaptive sparse group Lasso (adSGL) method, which combines the adaptive Lasso and the adaptive group Lasso (GL) to achieve bi-level selection. It can be viewed as an improved version of the sparse group Lasso (SGL), using data-dependent weights to improve selection performance. For computation, a block coordinate descent algorithm is adopted. Simulations show that adSGL performs satisfactorily in identifying both individual variables and groups, with a lower false discovery rate and mean squared error than SGL and GL. We apply the proposed method to the analysis of a household healthcare expenditure data set.
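A hedged sketch of the adSGL criterion (the weights and scaling are my notation): with individual weights w_j and group weights ξ_g,
\[
\hat{\beta} \;=\; \arg\min_{\beta}\; \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2 \;+\; \lambda_1 \sum_{j} w_j \lvert \beta_j \rvert \;+\; \lambda_2 \sum_{g} \xi_g \lVert \beta_{(g)} \rVert_2 ,
\]
where the ℓ1 term drives within-group sparsity, the group ℓ2 term drives group-level sparsity, and the data-dependent weights sharpen both, as in the adaptive Lasso.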

13.
The Granger causality test, as usually applied, is in fact a test of linear causality and cannot detect nonlinear causal relationships. Peguin and Terasvirta (1999) proposed a general extension based on Taylor expansions for testing nonlinear causality, using principal component extraction to deal with the resulting multicollinearity. Principal components, however, do not resolve the multicollinearity very well. Lasso regression is currently one of the main tools for handling multicollinearity: compared with other methods, it more readily produces sparse solutions and performs variable selection simultaneously with parameter estimation, so it can be used to address the multicollinearity in the test and improve the test's efficiency. Simulations of the testing procedure show that the Lasso-based test performs well.

14.
In Bayesian Lasso quantile regression, computing the sample likelihood and sampling from the posterior distribution are usually intractable. To address this problem, this article adopts a linear-interpolation-based method for computing the likelihood and, combining it with a Laplace prior, designs a new algorithm for sampling from the posterior distribution. Numerical simulations show that the method is adaptable and estimates parameters accurately.

15.
One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least-squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank-based procedures have desirable robustness properties compared to LS procedures, we propose a rank-based adaptive lasso-type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta-Carotene Level data set.
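As a rough sketch (Wilcoxon scores are one standard choice; the paper's exact penalty scaling should be checked there): with residuals e_i(β) = y_i − x_iᵀβ and ranks R(·),
\[
\hat{\beta} \;=\; \arg\min_{\beta}\; \sum_{i=1}^{n} a\big(R(e_i(\beta))\big)\, e_i(\beta) \;+\; \lambda \sum_{j=1}^{p} \frac{\widehat{\mathrm{se}}_j}{\lvert \tilde{\beta}_j \rvert}\, \lvert \beta_j \rvert , \qquad a(i) = \sqrt{12}\left( \frac{i}{n+1} - \frac{1}{2} \right),
\]
where the first term is Jaeckel's rank-based dispersion and the standard errors ŝe_j of an initial rank estimate β̃ enter the penalty to guard against multicollinearity.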

16.
This article considers the adaptive lasso procedure for the accelerated failure time model with multiple covariates, based on a weighted least squares method that uses Kaplan-Meier weights to account for censoring. The adaptive lasso method completes variable selection and model estimation simultaneously. Under some mild conditions, the estimator is shown to have sparsity and oracle properties. We use the Bayesian information criterion (BIC) for tuning parameter selection and a bootstrap variance approach for standard errors. Simulation studies and two real data examples are carried out to investigate the performance of the proposed method.
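The weighted least squares criterion with Kaplan-Meier (Stute) weights can be sketched as follows (the ordering and weight formula are the standard ones, not quoted from the paper): with log event times Y_(i) ordered by observed time, censoring indicators δ_(i), and an initial consistent estimate β̃,
\[
\hat{\beta} \;=\; \arg\min_{\beta}\; \sum_{i=1}^{n} w_i \big( Y_{(i)} - x_{(i)}^{\top}\beta \big)^2 \;+\; \lambda \sum_{j=1}^{p} \frac{\lvert \beta_j \rvert}{\lvert \tilde{\beta}_j \rvert} , \qquad
w_1 = \frac{\delta_{(1)}}{n}, \quad
w_i = \frac{\delta_{(i)}}{n-i+1} \prod_{k=1}^{i-1} \left( \frac{n-k}{n-k+1} \right)^{\delta_{(k)}} .
\]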

17.
The group Lasso is a penalized regression method used when the covariates are partitioned into groups, to promote sparsity at the group level [M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, J. R. Stat. Soc. Ser. B 68 (2006), pp. 49–67, doi: 10.1111/j.1467-9868.2005.00532.x]. Quantile group Lasso, a natural extension of quantile Lasso [Y. Wu and Y. Liu, Variable selection in quantile regression, Statist. Sinica 19 (2009), pp. 801–817], is a good alternative when the data have group information and many outliers and/or heavy tails. Much attention has been paid to discovering important features that are correlated with the outcome of interest and immune to outliers. In many applications, however, we may also want to retain the flexibility of selecting variables within a group. In this paper, we develop a sparse group variable selection method based on quantile regression that selects important covariates at both the group level and the within-group level, penalizing the empirical check loss function by the sum of square-root group-wise L1-norm penalties. The oracle properties are established in the setting where the number of parameters diverges. We also apply the new method to the varying-coefficient model with categorical effect modifiers. Simulations and a real data example show that the newly proposed method has robust and superior performance.
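On one reading of "the sum of square-root group-wise L1-norm penalties" (my transcription), the criterion is: with check loss ρ_τ(u) = u(τ − 1{u < 0}),
\[
\min_{\beta}\; \frac{1}{n} \sum_{i=1}^{n} \rho_{\tau}\big( y_i - x_i^{\top}\beta \big) \;+\; \lambda \sum_{g=1}^{G} \sqrt{ \lVert \beta_{(g)} \rVert_1 } ,
\]
a concave group penalty that can zero out whole groups while still allowing sparsity within a retained group.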

18.
Determination of the best subset is an important step in vector autoregressive (VAR) modeling. Traditional methods either conduct subset selection and parameter estimation separately or are computationally expensive. In this article, we propose a VAR model selection procedure using the adaptive Lasso, as it is computationally efficient and can select the subset and estimate parameters simultaneously. With a proper choice of tuning parameters, we can select the correct subset and obtain asymptotic normality of the nonzero parameters. Simulation studies and real data analysis show that the adaptive Lasso performs better than existing methods in VAR model fitting and prediction.
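A minimal equation-by-equation sketch of adaptive-Lasso VAR fitting (my construction via column rescaling; the paper's tuning and asymptotic results are its own):

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def var_adaptive_lasso(Y, p=2, lam=0.05, gamma=1.0):
    """Subset-select a VAR(p) equation by equation with the adaptive Lasso."""
    T, k = Y.shape
    # Design of lagged values [Y_{t-1}, ..., Y_{t-p}] for t = p, ..., T-1.
    X = np.hstack([Y[p - l:T - l] for l in range(1, p + 1)])
    Z = Y[p:]
    B = np.zeros((k, k * p))
    for i in range(k):
        ols = LinearRegression().fit(X, Z[:, i]).coef_
        w = np.abs(ols) ** gamma + 1e-8      # adaptive scaling (penalty weight is 1/w)
        fit = Lasso(alpha=lam).fit(X * w, Z[:, i])
        B[i] = fit.coef_ * w                 # undo the column rescaling
    return B                                 # zero entries = excluded lag terms

Y = np.cumsum(np.random.default_rng(3).normal(size=(300, 3)), axis=0)
print(np.round(var_adaptive_lasso(np.diff(Y, axis=0)), 2))
```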

19.
This article extends the adaptive Lasso variable selection method to the time-varying parameter vector autoregressive (TVP-VAR) model. The proposed method is applied to monthly data (2005-2014) on aviation kerosene prices and civil-aviation freight, mail, and passenger turnover, and is compared with four alternative methods. The results show that, relative to a constant-coefficient VAR model, the time-varying-coefficient VAR model markedly improves fit and forecast accuracy, and that the proposed adaptive Lasso time-varying-coefficient model consistently outperforms the Lasso time-varying-coefficient model of Belmonte, Koop and Korobilis (2014).

20.
Adaptive Lasso Quantile Regression Methods for Panel Data
Automatically selecting the important explanatory variables while estimating parameters has long been one of the central questions in panel-data quantile regression. This paper constructs a Bayesian hierarchical quantile regression model with multiple random effects and, assuming a new conditional Laplace prior for the fixed-effect coefficients, derives a Gibbs sampling algorithm for parameter estimation. Because coefficients of explanatory variables of differing importance should be shrunk by different amounts, the constructed prior is adaptive, so the important explanatory variables are selected automatically and accurately, and the slice-Gibbs sampler designed here solves the posterior-mean estimation of all model parameters quickly and effectively. Simulations show that the new method outperforms commonly used methods in the literature in both parameter estimation accuracy and variable selection accuracy. An analysis of panel data on several macroeconomic indicators across Chinese regions demonstrates the new method's ability to estimate parameters and select variables.
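For context, Bayesian quantile regression is typically built on the asymmetric Laplace working likelihood, and an adaptive-Lasso-style prior puts a coefficient-specific Laplace prior on each fixed effect (a schematic, not the paper's exact hierarchy):
\[
p(y \mid \beta) \;\propto\; \exp\!\Big( -\sum_{i} \rho_{\tau}\big( y_{i} - x_{i}^{\top}\beta \big) \Big), \qquad
\pi(\beta_j \mid \lambda_j) \;=\; \frac{\lambda_j}{2}\, e^{-\lambda_j \lvert \beta_j \rvert },
\]
so the posterior mode matches an adaptively weighted ℓ1-penalized quantile regression, and the Laplace prior's scale-mixture representation is what makes Gibbs sampling feasible.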

