Similar Articles
20 similar articles found.
1.
The smooth integration of counting and absolute deviation (SICA) penalty has been demonstrated, both theoretically and practically, to be effective in non-convex penalization for variable selection. However, solving the non-convex optimization problem associated with the SICA penalty when the number of variables exceeds the sample size remains challenging, owing to the singularity at the origin and the non-convexity of the SICA penalty function. In this paper, we develop an efficient and accurate alternating direction method of multipliers (ADMM) with a continuation algorithm for solving the SICA-penalized least squares problem in high dimensions. We establish the convergence of the proposed algorithm under mild regularity conditions and study the corresponding Karush–Kuhn–Tucker optimality condition. A high-dimensional Bayesian information criterion is developed to select the optimal tuning parameters. We conduct extensive simulation studies to evaluate the efficiency and accuracy of the proposed algorithm, and its practical usefulness is further illustrated with a high-dimensional microarray study.
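As a rough illustration of the ADMM splitting described above (not the authors' implementation), the sketch below solves a generic penalized least squares problem with an x = z split. The SICA proximal step, which is the technically hard part of the paper, is replaced here by the lasso's soft-thresholding prox purely for illustration; all names and default values are our own.

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding operator, the proximal map of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_penalized_ls(A, b, prox, rho=1.0, n_iter=300):
    """Generic ADMM for 0.5*||Ax - b||^2 + P(z) subject to x = z.
    `prox(v, rho)` must return the proximal operator of P with step 1/rho."""
    p = A.shape[1]
    x = np.zeros(p); z = np.zeros(p); u = np.zeros(p)
    # cache the matrix used by every x-update
    M = np.linalg.inv(A.T @ A + rho * np.eye(p))
    Atb = A.T @ b
    for _ in range(n_iter):
        x = M @ (Atb + rho * (z - u))   # quadratic x-update
        z = prox(x + u, rho)            # penalty prox (SICA prox would go here)
        u += x - z                      # dual update
    return z

# stand-in prox: plain lasso soft thresholding (NOT the SICA prox)
lam = 0.5
lasso_prox = lambda v, rho: soft(v, lam / rho)
```

With `A` the identity this reduces to componentwise soft thresholding, which makes the sketch easy to sanity-check; a continuation scheme would simply rerun it along a decreasing sequence of `lam` values, warm-starting each run.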

2.
We consider a linear regression model with group structure in the covariates. The group LASSO has been proposed for group variable selection, and many nonconvex penalties, such as the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP), have been extended to group variable selection problems. The group coordinate descent (GCD) algorithm is widely used for fitting these models. However, GCD algorithms are difficult to apply to nonconvex group penalties, owing to computational complexity, unless the design matrix is orthogonal. In this paper, we propose an efficient optimization algorithm for nonconvex group penalties that combines the concave–convex procedure with the group LASSO algorithm. We also extend the proposed algorithm to generalized linear models. We evaluate the numerical efficiency of the proposed algorithm against existing GCD algorithms on simulated and real data sets.

3.
In high-dimensional regression problems, regularization methods have been a popular choice for addressing variable selection and multicollinearity. In this paper we study bridge regression, which adaptively selects the penalty order from the data and produces flexible solutions in various settings. We implement bridge regression using local linear and quadratic approximations to circumvent the nonconvex optimization problem. Our numerical study shows that the proposed bridge estimators are a robust choice in various circumstances compared with other penalized regression methods such as ridge, lasso, and elastic net. In addition, we propose group bridge estimators that select grouped variables and study their asymptotic properties when the number of covariates increases with the sample size. These estimators are also applied to varying-coefficient models. Numerical examples show the superior performance of the proposed group bridge estimators in comparison with existing methods.

4.
High-dimensional data arise frequently in modern applications such as biology, chemometrics, economics, neuroscience, and other scientific fields. Common features of high-dimensional data are that many predictors may not be significant and that high correlation exists among the predictors. Generalized linear models, as the generalization of linear models, also suffer from this collinearity problem. In this paper, combining a nonconvex penalty with ridge regression, we propose the weighted elastic net for variable selection in high-dimensional generalized linear models and establish the theoretical properties of the proposed method with a diverging number of parameters. The finite-sample behavior of the proposed method is illustrated with simulation studies and a real data example.

5.
We propose the marginalized lasso, a new nonconvex penalty for variable selection in regression problems. The marginalized lasso penalty is obtained by integrating out the penalty parameter in the original lasso penalty with respect to a gamma prior distribution. We derive a thresholding rule and a lasso-based iterative algorithm for parameter estimation, and provide a coordinate descent algorithm to optimize the marginalized lasso penalized regression efficiently. Numerical comparisons demonstrate its competitiveness with existing sparsity-inducing penalties and suggest guidelines for tuning parameter selection.

6.
Group folded concave penalization problems have been shown to possess the satisfactory oracle property in theory. However, it remains unknown whether an optimization algorithm for the resulting nonconvex problem can find this oracle solution among multiple local solutions. In this paper, we extend the well-known local linear approximation (LLA) algorithm to group folded concave penalization in linear models. We prove that, with the group LASSO estimator as the initial value, the two-step LLA solution converges to the oracle estimator with overwhelming probability, thus closing the theoretical gap. The results are high-dimensional, allowing the number of groups to grow exponentially and the number of true relevant groups and the maximum group size to grow polynomially. Numerical studies are also conducted to show the merits of the LLA procedure.
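The two-step LLA scheme can be sketched in the simpler ungrouped case: fit a plain lasso, form coordinatewise weights from the derivative of a folded concave penalty (SCAD below), and solve one weighted lasso. This is a minimal illustration with our own names and defaults, not the paper's grouped implementation.

```python
import numpy as np

def scad_deriv(t, lam, gamma=3.7):
    """Derivative of the SCAD penalty; used as the LLA weights."""
    t = np.abs(t)
    return np.where(t <= lam, lam,
                    np.maximum(gamma * lam - t, 0.0) / (gamma - 1.0))

def weighted_lasso_cd(X, y, w, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + sum_j w_j |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    r = y.copy()                       # residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]        # partial residual excluding j
            z = X[:, j] @ r / n
            b[j] = np.sign(z) * max(abs(z) - w[j], 0.0) / col_ss[j]
            r -= X[:, j] * b[j]
    return b

def two_step_lla(X, y, lam):
    w0 = np.full(X.shape[1], lam)      # step 0: plain lasso initial value
    b0 = weighted_lasso_cd(X, y, w0)
    w1 = scad_deriv(b0, lam)           # step 1: LLA weights from b0
    return weighted_lasso_cd(X, y, w1)
```

Coefficients whose lasso estimates exceed γλ receive zero weight in the second step and are therefore left unpenalized, which is exactly how the LLA step removes the lasso's shrinkage bias on strong signals.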

7.
Many problems in statistics involve maximizing a multinomial likelihood over a restricted region. In this paper, we consider instead maximizing a weighted multinomial likelihood. We show that a dual problem always exists, which is frequently more tractable, and that a solution to the dual problem leads directly to a solution of the primal problem. Moreover, the form of the dual problem suggests an iterative algorithm for solving the MLE problem when the constraint region can be written as a finite intersection of cones. We show that this iterative algorithm is guaranteed to converge to the true solution, and that when the cones are isotonic it is a version of Dykstra's algorithm (Dykstra, J. Amer. Statist. Assoc. 78 (1983) 837–842) for the special case of least squares projection onto the intersection of isotonic cones. We give several meaningful examples to illustrate our results. In particular, we obtain the nonparametric maximum likelihood estimator of a monotone density function in the presence of selection bias.
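A minimal sketch of Dykstra's algorithm for projecting onto a finite intersection of convex sets, each assumed to have an easy projection of its own. The example below projects onto the probability simplex, written as the intersection of the nonnegative orthant and an affine hyperplane; the set choices, names, and iteration counts are ours, not the paper's.

```python
import numpy as np

def dykstra(y, projections, n_iter=1000):
    """Dykstra's cyclic projection scheme: converges to the least-squares
    projection of y onto the intersection of the given convex sets."""
    x = np.asarray(y, dtype=float).copy()
    corrections = [np.zeros_like(x) for _ in projections]
    for _ in range(n_iter):
        for i, proj in enumerate(projections):
            z = x + corrections[i]     # re-add this set's correction
            x = proj(z)
            corrections[i] = z - x     # store the new correction
    return x

proj_nonneg = lambda v: np.maximum(v, 0.0)              # cone {x : x >= 0}
proj_sum_one = lambda v: v - (v.sum() - 1.0) / v.size   # affine {x : sum(x) = 1}
```

Unlike plain alternating projections, the correction terms are what make the limit the actual least-squares projection onto the intersection rather than just some point in it.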

8.
Gibbs point processes (GPPs) constitute a large and flexible class of spatial point processes with explicit dependence between the points. They can model attractive as well as repulsive point patterns. Feature selection procedures are an important topic in high-dimensional statistical modeling. In this paper, a composite likelihood (in particular pseudo-likelihood) approach regularized with convex and nonconvex penalty functions is proposed to handle statistical inference for possibly high-dimensional inhomogeneous GPPs. We particularly investigate the setting where the number of covariates diverges as the domain of observation increases. Under some conditions provided on the spatial GPP and on penalty functions, we show that the oracle property, consistency and asymptotic normality hold. Our results also cover the low-dimensional case which fills a large gap in the literature. Through simulation experiments, we validate our theoretical results and finally, an application to a tropical forestry dataset illustrates the use of the proposed approach.

9.
From the prediction viewpoint, mode regression is attractive because it focuses on the most probable value of the response variable given the regressors. Meanwhile, high-dimensional data have become prevalent with advances in the technology for collecting and storing data, and variable selection is an important strategy for high-dimensional regression problems. This paper proposes a variable selection procedure for high-dimensional mode regression that combines nonparametric kernel estimation with a sparsity penalty. We also establish the asymptotic properties under certain technical conditions. The effectiveness and flexibility of the proposed methods are further illustrated by numerical studies and a real data application.

10.
In this paper, we propose a lower-bound-based smoothed quasi-Newton algorithm for computing the solution paths of the group bridge estimator in linear regression models. Our method combines a quasi-Newton algorithm with a smoothed group bridge penalty and a novel data-driven thresholding rule for the regression coefficients. This rule is derived from a necessary KKT condition of the group bridge optimization problem; it is easy to implement and can be used to eliminate groups with zero coefficients, thereby reducing the dimension of the optimization problem. The proposed algorithm removes the groupwise orthogonality condition required by coordinate descent and LARS algorithms for group variable selection. Numerical results show that the proposed algorithm outperforms coordinate-descent-based algorithms in both efficiency and accuracy.

11.
We consider the problem of variable selection in high-dimensional partially linear models with longitudinal data. A variable selection procedure is proposed based on the smooth-threshold generalized estimating equation (SGEE). The proposed procedure automatically eliminates inactive predictors by setting the corresponding parameters to zero and simultaneously estimates the nonzero regression coefficients by solving the SGEE. We establish the asymptotic properties in a high-dimensional framework where the number of covariates p_n increases as the number of clusters n increases. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed variable selection procedure.

12.
To perform variable selection in expectile regression, we introduce the elastic-net penalty into expectile regression and propose an elastic-net penalized expectile regression (ER-EN) model. We then adopt the semismooth Newton coordinate descent (SNCD) algorithm to solve the proposed ER-EN model in high-dimensional settings. The advantages of ER-EN model are illustrated via extensive Monte Carlo simulations. The numerical results show that the ER-EN model outperforms the elastic-net penalized least squares regression (LSR-EN), the elastic-net penalized Huber regression (HR-EN), the elastic-net penalized quantile regression (QR-EN) and conventional expectile regression (ER) in terms of variable selection and predictive ability, especially for asymmetric distributions. We also apply the ER-EN model to two real-world applications: relative location of CT slices on the axial axis and metabolism of tacrolimus (Tac) drug. Empirical results also demonstrate the superiority of the ER-EN model.

13.
L1-type regularization provides a useful tool for variable selection in high-dimensional regression modeling, and various algorithms have been proposed to solve the associated optimization problems. In particular, the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although it performs remarkably well on these optimization problems, it is sensitive to outliers, since each update is based on the inner product of a predictor variable with partial residuals obtained in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, focusing on high-dimensional regression modeling in the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that it performs effectively for high-dimensional regression modeling even in the presence of outliers.

14.
NETWORK EXPLORATION VIA THE ADAPTIVE LASSO AND SCAD PENALTIES
Graphical models are frequently used to explore networks, such as genetic networks, among a set of variables. This is usually carried out by exploring the sparsity of the precision matrix of the variables under consideration. Penalized likelihood methods are often used in such explorations, yet the positive-definiteness constraint on precision matrices makes the optimization problem challenging. We introduce non-concave penalties and the adaptive LASSO penalty to attenuate the bias problem in network estimation. Through the local linear approximation to the non-concave penalty functions, the problem of precision matrix estimation is recast as a sequence of penalized likelihood problems with a weighted L1 penalty and solved using the efficient algorithm of Friedman et al. (2008). Our estimation schemes are applied to two real datasets. Simulation experiments and asymptotic theory are used to justify the proposed methods.

15.
A number of variable selection methods have been proposed involving nonconvex penalty functions. These methods, which include the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP), have been demonstrated to have attractive theoretical properties, but model fitting is not a straightforward task, and the resulting solutions may be unstable. Here, we demonstrate the potential of coordinate descent algorithms for fitting these models, establishing theoretical convergence properties and demonstrating that they are significantly faster than competing approaches. In addition, we demonstrate the utility of convexity diagnostics to determine regions of the parameter space in which the objective function is locally convex, even though the penalty is not. Our simulation study and data examples indicate that nonconvex penalties like MCP and SCAD are worthwhile alternatives to the lasso in many applications. In particular, our numerical results suggest that MCP is the preferred approach among the three methods.
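What makes coordinate descent attractive for MCP and SCAD is that, for a standardized predictor, each coordinate update has a closed-form thresholding rule. A minimal sketch of these univariate solutions, using the commonly cited default γ values (our choices, not necessarily the paper's):

```python
import numpy as np

def soft(z, lam):
    """Soft-thresholding operator (the lasso update)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def mcp_threshold(z, lam, gamma=3.0):
    """Univariate MCP solution ("firm thresholding") for a
    standardized predictor; reduces to z once |z| exceeds gamma*lam."""
    if abs(z) <= gamma * lam:
        return soft(z, lam) / (1.0 - 1.0 / gamma)
    return z

def scad_threshold(z, lam, gamma=3.7):
    """Univariate SCAD solution for a standardized predictor:
    soft thresholding near zero, a rescaled soft threshold in the
    middle region, and no shrinkage beyond gamma*lam."""
    if abs(z) <= 2.0 * lam:
        return soft(z, lam)
    if abs(z) <= gamma * lam:
        return soft(z, gamma * lam / (gamma - 1.0)) / (1.0 - 1.0 / (gamma - 1.0))
    return z
```

Both rules agree with the lasso for small |z| but apply progressively less shrinkage as |z| grows, which is the source of the reduced bias (and of the local non-convexity the diagnostics above are meant to monitor).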

16.
The problem of dimension reduction in multiple regressions is investigated in this paper, in which data are from several populations that share the same variables. Assuming that the set of relevant predictors is the same across the regressions, a joint estimation and selection method is proposed, aiming to preserve the common structure, while allowing for population-specific characteristics. The new approach is based upon the relationship between sliced inverse regression and multiple linear regression, and is achieved through the lasso shrinkage penalty. A fast alternating algorithm is developed to solve the corresponding optimization problem. The performance of the proposed method is illustrated through simulated and real data examples.

17.
This paper studies outlier detection and robust variable selection in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to achieve outlier detection and robust variable selection simultaneously. An iterative algorithm is proposed to solve the resulting optimization problem. Monte Carlo studies evaluate the finite-sample performance of the proposed methods. The results indicate that the proposed methods outperform existing methods in finite samples when there are leverage points or outliers in the response or explanatory variables. Finally, we apply the proposed methodology to two real datasets.

18.
Song Peng et al. Statistical Research (统计研究), 2020, 37(7): 116-128
Estimation of high-dimensional covariance matrices has become a fundamental problem in the statistical analysis of big data. Traditional methods require the data to satisfy a normality assumption and do not account for outliers, so they no longer meet the needs of applications, and more robust estimation methods are urgently needed. For high-dimensional covariance matrices, a robust mean-median estimator based on subsample grouping has been proposed and is simple to implement; however, the matrix it estimates is neither positive definite nor sparse. To address this problem, this paper introduces a centered regularization algorithm that remedies the defect of the original method: by imposing an L1-norm penalty on the off-diagonal elements of the estimated matrix during the optimization, the estimated matrix becomes positive definite and sparse, which markedly improves its practical value. In numerical simulations, the proposed centered regularized robust estimator achieves higher estimation accuracy and more closely matches the sparsity structure of the true matrix. In a subsequent empirical portfolio analysis, the minimum-variance portfolio constructed from the centered regularized robust estimator exhibits lower return volatility than portfolios based on the traditional sample covariance estimator, the mean-median estimator, and the RA-LASSO method.
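One illustrative reading of the subsample-grouping idea is an elementwise median of per-group sample covariance matrices. This is our own sketch under that assumption; it omits the paper's centered L1 regularization step, which is what restores positive definiteness and sparsity, and the paper's exact construction may differ.

```python
import numpy as np

def median_of_means_cov(X, n_groups=5):
    """Elementwise median of per-group sample covariances:
    split the n rows of X into groups, compute each group's sample
    covariance, and take the entrywise median across groups."""
    groups = np.array_split(np.arange(X.shape[0]), n_groups)
    covs = np.stack([np.cov(X[g], rowvar=False) for g in groups])
    return np.median(covs, axis=0)
```

The entrywise median keeps each entry robust to a few contaminated groups, but the resulting matrix need not be positive definite, which is exactly the defect the centered regularization in the paper is designed to fix.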

19.
The sequential minimal optimization (SMO) algorithm is effective for solving large-scale support vector machines (SVMs). Existing algorithms all assume that the kernel is positive definite (PD) or positive semi-definite (PSD) and satisfies the Mercer condition. Some kernels, however, such as the sigmoid kernel, which originates from neural networks and is widely used in SVMs, are only conditionally PD; moreover, in practice it is often difficult to prove whether a kernel is PD or PSD except for a few well-known kernels. The applicability of existing SMO algorithms is therefore limited. To address this deficiency, we propose an algorithm for solving ε-SVR with non-positive semi-definite (non-PSD) kernels. Unlike existing algorithms, which must consider four Lagrange multipliers, the proposed algorithm needs to consider only two in its implementation. It simplifies the implementation by expanding the original dual program of ε-SVR and solving its KKT conditions, and is thus easily applied to ε-SVR with non-PSD kernels. The proposed algorithm is evaluated on five benchmark problems and one real-world problem. The results show that ε-SVR with non-PSD kernels provides more accurate predictions than with PD kernels.
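The claim that the sigmoid kernel need not be PSD is easy to verify numerically: for some parameter choices its Gram matrix has a negative eigenvalue. A small demonstration (the parameter values and data points are our own, chosen to exhibit indefiniteness):

```python
import numpy as np

def sigmoid_kernel(X, a=1.0, b=-1.0):
    """Gram matrix of the sigmoid (tanh) kernel k(x, y) = tanh(a*x.y + b).
    Unlike Mercer kernels, this matrix need not be PSD."""
    return np.tanh(a * (X @ X.T) + b)
```

For the two one-dimensional points 0 and 1 with a = 1 and b = -1, the 2x2 Gram matrix has one positive and one negative eigenvalue, so it is indefinite; this is precisely the situation that standard SMO analyses, which assume a PSD kernel, do not cover.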

20.
In this paper, a new method for robust principal component analysis (PCA) is proposed. PCA is a widely used tool for dimension reduction without substantial loss of information, but classical PCA is vulnerable to outliers because of its dependence on the empirical covariance matrix. To avoid this weakness, several alternative approaches based on robust scatter matrices have been suggested; a popular choice is ROBPCA, which combines projection pursuit ideas with robust covariance estimation via a variance maximization criterion. Our approach is instead based on the fact that PCA can be formulated as a regression-type optimization problem, which is the main difference from previous approaches. The proposed robust PCA is derived by replacing the squared loss with a robust alternative, the Huber loss function. A practical algorithm is proposed to carry out the optimization, and the convergence properties of the algorithm are investigated. Results from a simulation study and a real data example demonstrate the promising empirical properties of the proposed method.
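The substitution of the squared loss by the Huber loss can be sketched as follows; the loss is quadratic for small residuals and linear beyond a cutoff, and its ratio ψ(r)/r gives the residual weights an iteratively reweighted least squares (IRLS) scheme would use. The tuning constant c = 1.345 is a conventional choice in robust statistics, not necessarily the paper's.

```python
import numpy as np

def huber(r, c=1.345):
    """Huber loss: 0.5*r^2 for |r| <= c, and c*|r| - 0.5*c^2 beyond,
    so large residuals grow only linearly."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r ** 2, c * a - 0.5 * c ** 2)

def huber_weight(r, c=1.345):
    """IRLS weight psi(r)/r for the Huber loss: 1 inside the cutoff,
    c/|r| outside, so outlying residuals are downweighted."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / a)
```

In a regression-type formulation of PCA, these weights shrink the influence of outlying observations on the fitted principal subspace, which is the mechanism behind the robustness claimed above.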
