Similar Articles
20 similar articles found (search time: 15 ms)
1.
Oracle Inequalities for Convex Loss Functions with Nonlinear Targets   (cited by: 1; self-citations: 1; other citations: 0)
This article considers penalized empirical loss minimization of convex loss functions with unknown target functions. Using the elastic net penalty, of which the Least Absolute Shrinkage and Selection Operator (Lasso) is a special case, we establish a finite-sample oracle inequality which bounds the loss of our estimator from above with high probability. If the unknown target is linear, this inequality also provides an upper bound on the estimation error of the estimated parameter vector. Next, we use the non-asymptotic results to show that the excess loss of our estimator is asymptotically of the same order as that of the oracle. If the target is linear, we give sufficient conditions for consistency of the estimated parameter vector. We briefly discuss how a thresholded version of our estimator can be used to perform consistent variable selection. We give two examples of loss functions covered by our framework.
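The penalized estimator described above can be illustrated with a standard elastic net fit. This is a minimal sketch using scikit-learn; the simulated data, penalty level, and mixing parameter are illustrative assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]          # sparse truth: only 3 active coefficients
y = X @ beta + 0.1 * rng.standard_normal(n)

# sklearn's elastic net penalty: alpha*l1_ratio*||b||_1 + 0.5*alpha*(1-l1_ratio)*||b||_2^2;
# l1_ratio=1.0 recovers the Lasso as the special case mentioned above
fit = ElasticNet(alpha=0.05, l1_ratio=0.5).fit(X, y)
active = np.flatnonzero(fit.coef_)   # indices of selected variables
```

Setting `l1_ratio` between 0 and 1 trades off the Lasso's sparsity against ridge-style shrinkage of correlated coefficients.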

2.
The Lasso achieves variance reduction and variable selection by solving an ℓ1-regularized least squares problem. Huang (2003) claims that ‘there always exists an interval of regularization parameter values such that the corresponding mean squared prediction error for the Lasso estimator is smaller than for the ordinary least square estimator’. This result is correct. However, its proof in Huang (2003) is not. This paper presents a corrected proof of the claim, which exposes and uses some interesting fundamental properties of the Lasso.

3.
The penalized maximum likelihood estimator (PMLE) has been widely used for variable selection in high-dimensional data. Various penalty functions have been employed for this purpose, e.g., Lasso, weighted Lasso, or smoothly clipped absolute deviations. However, the PMLE can be very sensitive to outliers in the data, especially to outliers in the covariates (leverage points). In order to overcome this disadvantage, the usage of the penalized maximum trimmed likelihood estimator (PMTLE) is proposed to estimate the unknown parameters in a robust way. The computation of the PMTLE takes advantage of the same technology as used for PMLE but here the estimation is based on subsamples only. The breakdown point properties of the PMTLE are discussed using the notion of $d$-fullness. The performance of the proposed estimator is evaluated in a simulation study for the classical multiple linear and Poisson linear regression models.

4.
In survival studies, current status data are frequently encountered when some individuals in a study are not successively observed. This paper considers the problem of simultaneous variable selection and parameter estimation in the high-dimensional continuous generalized linear model with current status data. We apply the penalized likelihood procedure with the smoothly clipped absolute deviation penalty to select significant variables and estimate the corresponding regression coefficients. With a proper choice of tuning parameters, the resulting estimator is shown to be a √(n/pn)-consistent estimator under some mild conditions. In addition, we show that the resulting estimator has the same asymptotic distribution as the estimator obtained when the true model is known. The finite sample behavior of the proposed estimator is evaluated through simulation studies and a real example.

5.
Abstract

Structured sparsity has recently become a very popular technique for dealing with high-dimensional data. In this paper, we focus on theoretical problems for the overlapping group structure of generalized linear models (GLMs). Although the overlapping group Lasso method for GLMs has been widely applied, its theoretical properties are still unknown. Under some general conditions, we present oracle inequalities for the estimation and prediction error of the overlapping group Lasso method in the generalized linear model setting. We then apply these results to the Logistic and Poisson regression models. It is shown that the results of the Lasso and group Lasso procedures for GLMs can be recovered by specifying the group structures in our proposed method. The effect of overlap and the variable selection performance of our proposed method are both studied by numerical simulations. Finally, we apply our proposed method to two gene expression data sets: the p53 data and the lung cancer data.

6.
In the high-dimensional setting, componentwise L2 boosting has been used to construct sparse models that perform well, but it tends to select many ineffective variables. Several sparse boosting methods, such as SparseL2Boosting and Twin Boosting, have been proposed to improve the variable selection of the L2 boosting algorithm. In this article, we propose a new general sparse boosting method (GSBoosting). Relations are established between GSBoosting and other well-known regularized variable selection methods in the orthogonal linear model, such as the adaptive Lasso and hard thresholding. Simulation results show that GSBoosting performs well in both prediction and variable selection.
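Plain componentwise L2 boosting, the baseline that the sparse variants above aim to improve, can be sketched in a few lines of NumPy. The step size, number of steps, and simulated data here are illustrative choices:

```python
import numpy as np

def l2_boost(X, y, steps=300, nu=0.1):
    """Componentwise L2 boosting: at each step, fit the single predictor
    that best reduces the residual sum of squares, damped by step size nu."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y.astype(float).copy()
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(steps):
        coefs = X.T @ r / col_sq                     # per-column least-squares fit to residual
        sse = (r ** 2).sum() - coefs ** 2 * col_sq   # RSS after each candidate update
        j = int(np.argmin(sse))                      # best single predictor this round
        beta[j] += nu * coefs[j]
        r -= nu * coefs[j] * X[:, j]
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(200)
beta_hat = l2_boost(X, y)
```

Because late iterations keep picking up small noise fits, unstopped boosting accumulates small coefficients on irrelevant variables, which is exactly the weakness the sparse boosting variants target.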

7.
We consider estimation in a high-dimensional linear model with strongly correlated variables. We propose to cluster the variables first and do subsequent sparse estimation such as the Lasso for cluster-representatives or the group Lasso based on the structure from the clusters. Regarding the first step, we present a novel and bottom-up agglomerative clustering algorithm based on canonical correlations, and we show that it finds an optimal solution and is statistically consistent. We also present some theoretical arguments that canonical correlation based clustering leads to a better-posed compatibility constant for the design matrix which ensures identifiability and an oracle inequality for the group Lasso. Furthermore, we discuss circumstances where cluster-representatives and using the Lasso as subsequent estimator leads to improved results for prediction and detection of variables. We complement the theoretical analysis with various empirical results.
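The two-step "cluster, then select" idea can be sketched as follows. Here ordinary correlation-based hierarchical clustering stands in for the paper's canonical-correlation algorithm, and all data and tuning values are illustrative assumptions:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 200
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
# two blocks of three strongly correlated predictors each
X = np.hstack([z1[:, None] + 0.05 * rng.standard_normal((n, 3)),
               z2[:, None] + 0.05 * rng.standard_normal((n, 3))])
y = 2.0 * z1 + 0.1 * rng.standard_normal(n)

# Step 1: cluster variables using a correlation-based distance
corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - np.abs(corr)
labels = fcluster(linkage(dist[np.triu_indices(6, k=1)], method="average"),
                  t=2, criterion="maxclust")

# Step 2: Lasso on one representative (the within-cluster mean) per cluster
reps = np.column_stack([X[:, labels == k].mean(axis=1) for k in (1, 2)])
fit = Lasso(alpha=0.05).fit(reps, y)
active = np.flatnonzero(fit.coef_)
```

Averaging a block of near-duplicates into one representative sidesteps the Lasso's known instability when it must choose arbitrarily among highly correlated columns.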

8.
Statistics, 2012, 46(6): 1234–1250
Abstract

We consider principal varying coefficient models in the high-dimensional setting, combined with variable selection, to reduce the effective number of parameters in semiparametric modelling. The estimation is based on a B-spline approach. For the unpenalized estimator, we establish non-asymptotic bounds; we then establish the (asymptotic) local oracle property of the penalized estimator, as well as non-asymptotic error bounds. Monte Carlo studies reveal the favourable performance of the estimator, and an application to a real dataset is presented.

9.
In this article, we study a nonparametric approach regarding a general nonlinear reduced form equation to achieve a better approximation of the optimal instrument. Accordingly, we propose the nonparametric additive instrumental variable estimator (NAIVE) with the adaptive group Lasso. We theoretically demonstrate that the proposed estimator is root-n consistent and asymptotically normal. The adaptive group Lasso helps us select the valid instruments, while the dimensionality of potential instrumental variables is allowed to be greater than the sample size. In practice, the degree and knots of the B-spline series are selected by minimizing the BIC or EBIC criteria for each nonparametric additive component in the reduced form equation. In Monte Carlo simulations, we show that the NAIVE has the same performance as the linear instrumental variable (IV) estimator for a truly linear reduced form equation. On the other hand, the NAIVE performs much better in terms of bias and mean squared error than alternative estimators under a high-dimensional nonlinear reduced form equation. We further illustrate our method in an empirical study of international trade and growth. Our findings provide stronger evidence that international trade has a significant positive effect on economic growth.

10.
We consider the problem of variable selection and estimation in the linear regression model when the number of parameters diverges with the sample size. We propose the adaptive Generalized Ridge-Lasso (AdaGril), an extension of the adaptive Elastic Net. AdaGril incorporates information on redundancy among correlated variables for model selection and estimation. It combines the strengths of quadratic regularization and adaptively weighted Lasso shrinkage. In this article, we highlight the grouped selection property of the AdaCnet method (one type of AdaGril) in the equal correlation case. Under weak conditions, we establish the oracle property of AdaGril, which ensures optimal large-sample performance when the dimension is high. Consequently, it both handles the problem of collinearity in high dimensions and enjoys the oracle property. Moreover, we show that the AdaGril estimator achieves a Sparsity Inequality, i.e., a bound in terms of the number of non-zero components of the “true” regression coefficient. This bound is obtained under a weak Restricted Eigenvalue (RE) condition similar to that used for the Lasso. Simulation studies show that some particular cases of AdaGril outperform its competitors.

11.
Semiparametric regression models with multiple covariates are commonly encountered. When there are covariates not associated with the response variable, variable selection may lead to sparser models, more lucid interpretation and more accurate estimation. In this study, we adopt a sieve approach for the estimation of nonparametric covariate effects in semiparametric regression models, and a two-step iterated penalization approach for variable selection. In the first step, a mixture of the Lasso and group Lasso penalties is employed to conduct the first-round variable selection and obtain the initial estimate. In the second step, a mixture of the weighted Lasso and weighted group Lasso penalties, with weights constructed using the initial estimate, is employed for variable selection. We show that the proposed iterated approach has the variable selection consistency property, even when the number of unknown parameters diverges with the sample size. Numerical studies, including simulation and the analysis of a diabetes dataset, show satisfactory performance of the proposed approach.

12.
Bayesian model averaging (BMA) is an effective technique for addressing model uncertainty in variable selection problems. However, current BMA approaches have computational difficulty dealing with data in which there are many more measurements (variables) than samples. This paper presents a method for combining ℓ1 regularization and Markov chain Monte Carlo model composition techniques for BMA. By treating the ℓ1 regularization path as a model space, we propose a method to resolve the model uncertainty issues arising in model averaging from solution path point selection. We show that this method is computationally and empirically effective for regression and classification in high-dimensional data sets. We apply our technique in simulations, as well as to some applications that arise in genomics.
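Treating the ℓ1 path as a discrete model space can be sketched with scikit-learn's `lasso_path`; each point on the path proposes a candidate support set. The data and grid size below are illustrative assumptions, and the MC3 averaging step itself is omitted:

```python
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(0)
n, p = 100, 8
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.standard_normal(n)

# coefficient profiles along a grid of 50 penalty levels
alphas, coefs, _ = lasso_path(X, y, n_alphas=50)   # coefs has shape (p, n_alphas)

# each point on the path proposes a candidate model (a support set)
supports = {tuple(np.flatnonzero(c)) for c in coefs.T}
```

The set of distinct supports along the path is small compared with all 2^p subsets, which is what makes path-based model averaging computationally tractable.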

13.
This article compares the mean-squared error (or ℓ2 risk) of ordinary least squares (OLS), James–Stein, and least absolute shrinkage and selection operator (Lasso) shrinkage estimators in simple linear regression where the number of regressors is smaller than the sample size. We compare and contrast the known risk bounds for these estimators, which shows that neither James–Stein nor Lasso uniformly dominates the other. We investigate the finite sample risk using a simple simulation experiment. We find that the risk of Lasso estimation is particularly sensitive to coefficient parameterization, and for a significant portion of the parameter space Lasso has higher mean-squared error than OLS. This investigation suggests that there are potential pitfalls arising with Lasso estimation, and simulation studies need to be more attentive to careful exploration of the parameter space.
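The sensitivity of Lasso risk to the coefficient parameterization is easy to see in a small simulation. Under an orthonormal design the Lasso reduces to soft-thresholding of the OLS coefficients, which this illustrative NumPy sketch exploits (all sizes and the penalty level are assumptions, not the paper's design):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps, lam = 50, 5, 2000, 0.5
X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # orthonormal design: X'X = I
beta = np.full(p, 2.0)                             # dense beta, every coefficient far from zero

mse_ols = mse_lasso = 0.0
for _ in range(reps):
    y = X @ beta + rng.standard_normal(n)
    b_ols = X.T @ y                                # OLS under X'X = I
    # Lasso with orthonormal design = soft-thresholded OLS
    b_lasso = np.sign(b_ols) * np.maximum(np.abs(b_ols) - lam, 0.0)
    mse_ols += ((b_ols - beta) ** 2).sum() / reps
    mse_lasso += ((b_lasso - beta) ** 2).sum() / reps
# with no true zeros to exploit, the Lasso's shrinkage bias dominates and its risk exceeds OLS
```

Replacing the dense `beta` with a sparse one flips the comparison, illustrating why neither estimator uniformly dominates.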

14.
We propose a shrinkage procedure for simultaneous variable selection and estimation in generalized linear models (GLMs) with an explicit predictive motivation. The procedure estimates the coefficients by minimizing the Kullback-Leibler divergence of a set of predictive distributions to the corresponding predictive distributions for the full model, subject to an ℓ1 constraint on the coefficient vector. This results in selection of a parsimonious model with similar predictive performance to the full model. Thanks to its similar form to the original Lasso problem for GLMs, our procedure can benefit from available ℓ1-regularization path algorithms. Simulation studies and real data examples confirm the efficiency of our method in terms of predictive performance on future observations.

15.
In this paper, we study the asymptotic properties of adaptive Lasso estimators in high-dimensional generalized linear models. The consistency of the adaptive Lasso estimator is obtained. We show that, if a reasonable initial estimator is available, then under appropriate conditions the adaptive Lasso correctly selects covariates with nonzero coefficients with probability converging to one, and the estimators of the nonzero coefficients have the same asymptotic distribution they would have if the zero coefficients were known in advance. Thus, the adaptive Lasso has an oracle property. The results are examined by some simulations and a real example.
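The adaptive Lasso's weighted penalty can be implemented by rescaling columns with an initial estimate and running a plain Lasso. This sketch uses ridge as the "reasonable initial estimator"; the data, penalty levels, and weight power are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta = np.array([3.0, -2.0] + [0.0] * 8)
y = X @ beta + 0.5 * rng.standard_normal(n)

# Step 1: a reasonable initial estimator (here: ridge)
b_init = Ridge(alpha=1.0).fit(X, y).coef_
w = 1.0 / np.abs(b_init)        # adaptive weights: large for likely-zero coefficients

# Step 2: weighted Lasso via column rescaling, then map back to the original scale
fit = Lasso(alpha=0.1).fit(X / w, y)
b_adapt = fit.coef_ / w
active = np.flatnonzero(b_adapt)
```

Because truly zero coefficients receive large weights, they face a much stiffer penalty than the signal coefficients, which is the mechanism behind the oracle property described above.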

16.
Recent work has shown that Lasso-based regularization is very useful for estimating the high-dimensional inverse covariance matrix. A particularly useful scheme is based on penalizing the ℓ1 norm of the off-diagonal elements to encourage sparsity. We embed this type of regularization into high-dimensional classification. A two-stage estimation procedure is proposed which first recovers structural zeros of the inverse covariance matrix and then enforces block sparsity by moving non-zeros closer to the main diagonal. We show that the block-diagonal approximation of the inverse covariance matrix leads to an additive classifier, and demonstrate that accounting for the structure can yield better performance accuracy. The effect of the block size on classification is explored, and a class of asymptotically equivalent structure approximations in a high-dimensional setting is specified. We suggest variable selection at the block level and investigate properties of this procedure in growing-dimension asymptotics. We present a consistency result on the feature selection procedure, establish asymptotic lower and upper bounds for the fraction of separative blocks, and specify constraints under which reliable classification with block-wise feature selection can be performed. The relevance and benefits of the proposed approach are illustrated on both simulated and real data.

17.
Abstract

Covariance estimation and selection for multivariate datasets in a high-dimensional regime is a fundamental problem in modern statistics. Gaussian graphical models are a popular class of models used for this purpose. Current Bayesian methods for inverse covariance matrix estimation under Gaussian graphical models require the underlying graph, and hence the ordering of variables, to be known. However, in practice, such information on the true underlying model is often unavailable. We therefore propose a novel permutation-based Bayesian approach to tackle the unknown variable ordering issue. In particular, we utilize multiple maximum a posteriori estimates under the DAG-Wishart prior for each permutation, and subsequently construct the final estimate of the inverse covariance matrix. The proposed estimator has smaller variability and is order-invariant. We establish posterior convergence rates under mild assumptions and illustrate that our method outperforms existing approaches in estimating inverse covariance matrices via simulation studies.

18.
G. Aneiros, F. Ferraty & P. Vieu, Statistics, 2015, 49(6): 1322–1347
The problem of variable selection is considered in high-dimensional partial linear regression under a model allowing for a possibly functional variable. The procedure studied is that of nonconcave-penalized least squares. The existence of a √(n/sn)-consistent estimator for the vector of pn linear parameters in the model is shown, even when pn tends to ∞ as the sample size n increases (sn denotes the number of influential variables). An oracle property is also obtained for the variable selection method, and the nonparametric rate of convergence is stated for the estimator of the nonlinear functional component of the model. Finally, a simulation study illustrates the finite-sample performance of our procedure.

19.
Abstract

Variable selection is a fundamental challenge in statistical learning when one works with data sets containing a huge number of predictors. In this article we consider procedures popular in model selection: the Lasso and the adaptive Lasso. Our goal is to investigate properties of estimators based on minimization of a Lasso-type penalized empirical risk with a convex, possibly nondifferentiable, loss function. We obtain theorems concerning the rate of convergence in estimation, consistency in model selection and oracle properties for Lasso estimators when the number of predictors is fixed, i.e. it does not depend on the sample size. Moreover, we study the properties of Lasso and adaptive Lasso estimators on simulated and real data sets.

20.
L1-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L1-type regularization; in particular, the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm shows remarkable performance in solving such optimization problems, it suffers from outliers, since the procedure is based on the inner product of predictor variables and partial residuals computed in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, focusing especially on high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm performs effectively for high-dimensional regression modeling even in the presence of outliers.
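The (non-robust) coordinate descent cycle that the paper starts from is the classical soft-thresholding update. A minimal NumPy sketch, under the common (1/2n)·RSS + λ‖β‖₁ parameterization and with illustrative data:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b
    for _ in range(n_sweeps):
        for j in range(p):
            r += X[:, j] * b[j]            # add back coordinate j's contribution
            rho = X[:, j] @ r / n          # inner product with the partial residual
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(200)
b = lasso_cd(X, y, lam=0.1)
```

The inner product `rho` is exactly the non-robust quantity the paper singles out: a single gross outlier in `y` or in a column of `X` can distort it, and the robust variant proposed above replaces it with a robust counterpart (omitted here).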


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司), 京ICP备09084417号