首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
In this paper, we consider a single-index regression model for which we propose a robust estimation procedure for the model parameters and an efficient variable selection of relevant predictors. The proposed method is known as the penalized generalized signed-rank procedure. Asymptotic properties of the proposed estimator are established under mild regularity conditions. Extensive Monte Carlo simulation experiments are carried out to study the finite sample performance of the proposed approach. The simulation results demonstrate that the proposed method dominates many of the existing ones in terms of robustness of estimation and efficiency of variable selection. Finally, a real data example is given to illustrate the method.  相似文献   

2.
Analysis of massive datasets is challenging owing to limitations of computer primary memory. Composite quantile regression (CQR) is a robust and efficient estimation method. In this paper, we extend CQR to massive datasets and propose a divide-and-conquer CQR method. The basic idea is to split the entire dataset into several blocks, applying the CQR method for data in each block, and finally combining these regression results via weighted average. The proposed approach significantly reduces the required amount of primary memory, and the resulting estimate will be as efficient as if the entire data set is analysed simultaneously. Moreover, to improve the efficiency of CQR, we propose a weighted CQR estimation approach. To achieve sparsity with high-dimensional covariates, we develop a variable selection procedure to select significant parametric components and prove the method possessing the oracle property. Both simulations and data analysis are conducted to illustrate the finite sample performance of the proposed methods.  相似文献   

3.
Abstract

In this article, we study the variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and empirically shown to be much more efficient and robust with respect to heavy tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the covariates missing, we consider the inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator by using estimated selection probability has a smaller asymptotic variance than that with true selection probability, thus is more efficient. Therefore, the important Horvitz-Thompson property is verified for penalized rank estimator with the covariates missing in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.  相似文献   

4.
In this article, we consider the variable selection for a class of semiparametric instrumental variable models. By combining orthogonal weighting technology and empirical likelihood method, we propose an orthogonal weighted empirical likelihood-based variable selection procedure. Under some mild conditions, the consistency and sparsity of the variable selection procedure are studied. Furthermore, some simulation studies and a real data analysis are carried out to examine the finite-sample performance of the proposed method.  相似文献   

5.
This paper considers robust variable selection in semiparametric modeling for longitudinal data with an unspecified dependence structure. First, by basis spline approximation and using a general formulation to treat mean, median, quantile and robust mean regressions in one setting, we propose a weighted M-type regression estimator, which achieves robustness against outliers in both the response and covariates directions, and can accommodate heterogeneity, and the asymptotic properties are also established. Furthermore, a penalized weighted M-type estimator is proposed, which can do estimation and select relevant nonparametric and parametric components simultaneously, and robustly. Without any specification of error distribution and intra-subject dependence structure, the variable selection method works beautifully, including consistency in variable selection and oracle property in estimation. Simulation studies also confirm our method and theories.  相似文献   

6.
Although the t-type estimator is a kind of M-estimator with scale optimization, it has some advantages over the M-estimator. In this article, we first propose a t-type joint generalized linear model as a robust extension to the classical joint generalized linear models for modeling data containing extreme or outlying observations. Next, we develop a t-type pseudo-likelihood (TPL) approach, which can be viewed as a robust version to the existing pseudo-likelihood (PL) approach. To determine which variables significantly affect the variance of the response variable, we then propose a unified penalized maximum TPL method to simultaneously select significant variables for the mean and dispersion models in t-type joint generalized linear models. Thus, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the mean and dispersion models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. Simulation studies are conducted to illustrate the proposed methods.  相似文献   

7.
This paper studies the outlier detection and robust variable selection problem in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to simultaneously achieve outlier detection, and robust variable selection. An iterative algorithm is proposed to solve the proposed optimization problem. Monte Carlo studies are evaluated the finite-sample performance of the proposed methods. The results indicate that the finite sample performance of the proposed methods performs better than that of the existing methods when there are leverage points or outliers in the response variable or explanatory variables. Finally, we apply the proposed methodology to analyze two real datasets.  相似文献   

8.
We consider a linear regression model where there are group structures in covariates. The group LASSO has been proposed for group variable selections. Many nonconvex penalties such as smoothly clipped absolute deviation and minimax concave penalty were extended to group variable selection problems. The group coordinate descent (GCD) algorithm is used popularly for fitting these models. However, the GCD algorithms are hard to be applied to nonconvex group penalties due to computational complexity unless the design matrix is orthogonal. In this paper, we propose an efficient optimization algorithm for nonconvex group penalties by combining the concave convex procedure and the group LASSO algorithm. We also extend the proposed algorithm for generalized linear models. We evaluate numerical efficiency of the proposed algorithm compared to existing GCD algorithms through simulated data and real data sets.  相似文献   

9.
This paper focuses on robust estimation and variable selection for partially linear models. We combine the weighted least absolute deviation (WLAD) regression with the adaptive least absolute shrinkage and selection operator (LASSO) to achieve simultaneous robust estimation and variable selection for partially linear models. Compared with the LAD-LASSO method, the WLAD-LASSO method will resist to the heavy-tailed errors and outliers in the parametric components. In addition, we estimate the unknown smooth function by a robust local linear regression. Under some regular conditions, the theoretical properties of the proposed estimators are established. We further examine finite-sample performance of the proposed procedure by simulation studies and a real data example.  相似文献   

10.
High-dimensional data arise frequently in modern applications such as biology, chemometrics, economics, neuroscience and other scientific fields. The common features of high-dimensional data are that many of predictors may not be significant, and there exists high correlation among predictors. Generalized linear models, as the generalization of linear models, also suffer from the collinearity problem. In this paper, combining the nonconvex penalty and ridge regression, we propose the weighted elastic-net to deal with the variable selection of generalized linear models on high dimension and give the theoretical properties of the proposed method with a diverging number of parameters. The finite sample behavior of the proposed method is illustrated with simulation studies and a real data example.  相似文献   

11.
Regularized variable selection is a powerful tool for identifying the true regression model from a large number of candidates by applying penalties to the objective functions. The penalty functions typically involve a tuning parameter that controls the complexity of the selected model. The ability of the regularized variable selection methods to identify the true model critically depends on the correct choice of the tuning parameter. In this study, we develop a consistent tuning parameter selection method for regularized Cox's proportional hazards model with a diverging number of parameters. The tuning parameter is selected by minimizing the generalized information criterion. We prove that, for any penalty that possesses the oracle property, the proposed tuning parameter selection method identifies the true model with probability approaching one as sample size increases. Its finite sample performance is evaluated by simulations. Its practical use is demonstrated in The Cancer Genome Atlas breast cancer data.  相似文献   

12.
This article proposes a variable selection approach for zero-inflated count data analysis based on the adaptive lasso technique. Two models including the zero-inflated Poisson and the zero-inflated negative binomial are investigated. An efficient algorithm is used to minimize the penalized log-likelihood function in an approximate manner. Both the generalized cross-validation and Bayesian information criterion procedures are employed to determine the optimal tuning parameter, and a consistent sandwich formula of standard errors for nonzero estimates is given based on local quadratic approximation. We evaluate the performance of the proposed adaptive lasso approach through extensive simulation studies, and apply it to analyze real-life data about doctor visits.  相似文献   

13.
ABSTRACT

In this paper, we study a novelly robust variable selection and parametric component identification simultaneously in varying coefficient models. The proposed estimator is based on spline approximation and two smoothly clipped absolute deviation (SCAD) penalties through rank regression, which is robust with respect to heavy-tailed errors or outliers in the response. Furthermore, when the tuning parameter is chosen by modified BIC criterion, we show that the proposed procedure is consistent both in variable selection and the separation of varying and constant coefficients. In addition, the estimators of varying coefficients possess the optimal convergence rate under some assumptions, and the estimators of constant coefficients have the same asymptotic distribution as their counterparts obtained when the true model is known. Simulation studies and a real data example are undertaken to assess the finite sample performance of the proposed variable selection procedure.  相似文献   

14.
We propose a robust rank-based estimation and variable selection in double generalized linear models when the number of parameters diverges with the sample size. The consistency of the variable selection procedure and asymptotic properties of the resulting estimators are established under appropriate selection of tuning parameters. Simulations are performed to assess the finite sample performance of the proposed estimation and variable selection procedure. In the presence of gross outliers, the proposed method is showing that the variable selection method works better. For practical application, a real data application is provided using nutritional epidemiology data, in which we explore the relationship between plasma beta-carotene levels and personal characteristics (e.g. age, gender, fat, etc.) as well as dietary factors (e.g. smoking status, intake of cholesterol, etc.).  相似文献   

15.
Variable selection is fundamental to high-dimensional multivariate generalized linear models. The smoothly clipped absolute deviation (SCAD) method can solve the problem of variable selection and estimation. The choice of the tuning parameter in the SCAD method is critical, which controls the complexity of the selected model. This article proposes a criterion to select the tuning parameter for the SCAD method in multivariate generalized linear models, which is shown to be able to identify the true model consistently. Simulation studies are conducted to support theoretical findings, and two real data analysis are given to illustrate the proposed method.  相似文献   

16.
A fully nonparametric model may not perform well or when the researcher wants to use a parametric model but the functional form with respect to a subset of the regressors or the density of the errors is not known. This becomes even more challenging when the data contain gross outliers or unusual observations. However, in practice the true covariates are not known in advance, nor is the smoothness of the functional form. A robust model selection approach through which we can choose the relevant covariates components and estimate the smoothing function may represent an appealing tool to the solution. A weighted signed-rank estimation and variable selection under the adaptive lasso for semi-parametric partial additive models is considered in this paper. B-spline is used to estimate the unknown additive nonparametric function. It is shown that despite using B-spline to estimate the unknown additive nonparametric function, the proposed estimator has an oracle property. The robustness of the weighted signed-rank approach for data with heavy-tail, contaminated errors, and data containing high-leverage points are validated via finite sample simulations. A practical application to an economic study is provided using an updated Canadian household gasoline consumption data.  相似文献   

17.
Abstract. Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model selection criterion is proposed to select the best one among this preselected set. The approach leads to a fast and efficient procedure for variable selection, especially in high‐dimensional settings. Model selection consistency of the suggested criterion is proven when the number of covariates d is fixed. Simulation studies suggest that the criterion still enjoys model selection consistency when d is much larger than the sample size. The simulations also show that our approach for variable selection works surprisingly well in comparison with existing competitors. The method is also applied to a real data set.  相似文献   

18.
This paper focuses on the variable selections for a varying coefficient models with missing response at random. A procedure is presented by basis function approximations with smooth-threshold estimating equations. Furthermore, the proposed method selects significant variables and estimates coefficients simultaneously avoiding the problem of solving a convex optimization, which reduced the burden of computation. Compared to existing equation based approaches, our procedure is more efficient and quick. With proper choices the regularization parameter, the resulting estimates perform an oracle property. A cross-validation for tuning parameter selection is also proposed, a numerical study confirms the performance of the proposed method.  相似文献   

19.
In this paper, we propose a new full iteration estimation method for quantile regression (QR) of the single-index model (SIM). The asymptotic properties of the proposed estimator are derived. Furthermore, we propose a variable selection procedure for the QR of SIM by combining the estimation method with the adaptive LASSO penalized method to get sparse estimation of the index parameter. The oracle properties of the variable selection method are established. Simulations with various non-normal errors are conducted to demonstrate the finite sample performance of the estimation method and the variable selection procedure. Furthermore, we illustrate the proposed method by analyzing a real data set.  相似文献   

20.
Abstract

Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with heavy tails and outliers. In this paper, we introduce a robust variable selection procedure for FMR models using the t distribution. With appropriate selection of the tuning parameters, the consistency and the oracle property of the regularized estimators are established. To estimate the parameters of the model, we develop an EM algorithm for numerical computations and a method for selecting tuning parameters adaptively. The parameter estimation performance of the proposed model is evaluated through simulation studies. The application of the proposed model is illustrated by analyzing a real data set.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号