Found 20 similar articles (search time: 15 ms)
This paper is concerned with the selection of explanatory variables in generalized linear models (GLMs). The class of GLMs is quite large and includes, for example, ordinary linear regression, binary logistic regression, the probit model, and Poisson regression with a linear or log-linear parameter structure. We show that, through an approximation of the log likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence, no specific computer software for variable selection in GLMs is needed; instead, any suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log likelihood approximation is very good in many practical situations. Finally, we briefly mention possible extensions to regression models outside the class of GLMs.
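The abstract above does not spell out its approximation or transformation. As a rough sketch of one standard way such a conversion can work (an assumption, not necessarily the authors' construction), the log likelihood of a logistic model can be approximated by a weighted quadratic around a pilot estimate, and the weights can then be absorbed into the data by scaling each row with the square root of its weight. The function names `fit_logistic_irls` and `to_unweighted_ls` and the simulated data are hypothetical.

```python
import numpy as np

def fit_logistic_irls(X, y, n_iter=50, tol=1e-10):
    """Pilot fit: logistic regression by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)                 # working weights
        z = eta + (y - mu) / w              # working response
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def to_unweighted_ls(X, y, beta):
    """Scale rows by sqrt(weight) so the quadratic approximation of the
    log likelihood at `beta` becomes an ordinary least-squares problem."""
    eta = X @ beta
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = mu * (1.0 - mu)
    z = eta + (y - mu) / w
    s = np.sqrt(w)
    return s[:, None] * X, s * z

# Simulated data (hypothetical): intercept plus three covariates, one active.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
true_beta = np.array([-0.5, 1.0, 0.0, 0.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_logistic_irls(X, y)
X_star, y_star = to_unweighted_ls(X, y, beta_hat)

# Ordinary least squares on the transformed data reproduces the pilot fit,
# so any linear-regression selection routine can be run on (X_star, y_star).
beta_ls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
```

In this sketch the transformed pair `(X_star, y_star)` could be handed to any subset-selection or stepwise routine written for ordinary linear regression; how well the quadratic approximation holds for submodels is exactly what the abstract's simulation study addresses.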

In this paper, we propose a new full iteration estimation method for quantile regression (QR) of the single-index model (SIM). The asymptotic properties of the proposed estimator are derived. Furthermore, we propose a variable selection procedure for the QR of SIM by combining the estimation method with the adaptive LASSO penalty to obtain a sparse estimate of the index parameter. The oracle properties of the variable selection method are established. Simulations with various non-normal errors are conducted to demonstrate the finite sample performance of the estimation method and the variable selection procedure. Finally, we illustrate the proposed method by analyzing a real data set.

Stepwise variable selection procedures are computationally inexpensive methods for constructing useful regression models for a single dependent variable. At each step a variable is entered into or deleted from the current model, based on the criterion of minimizing the error sum of squares (SSE). When there is more than one dependent variable, the situation is more complex. In this article we propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion. Specifically, we suggest minimizing some function of the estimated error covariance matrix: the trace, the determinant, or the largest eigenvalue. The computations associated with these criteria may be burdensome. We develop a computational framework based on the use of the SWEEP operator which greatly reduces these calculations for stepwise variable selection in multivariate regression.
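The three criteria named in the abstract above can be illustrated with a single forward-selection step. This is a minimal sketch under stated assumptions: it refits each candidate model by least squares instead of using the SWEEP operator, it divides the residual cross-product by n, and the names `error_cov`, `CRITERIA`, and `forward_step` are hypothetical.

```python
import numpy as np

def error_cov(X, Y):
    """Estimated error covariance matrix of a multivariate regression of Y on X
    (divisor n is an assumption; other divisors rank models identically)."""
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    R = Y - X @ B
    return R.T @ R / X.shape[0]

# Scalar summaries of the error covariance matrix mentioned in the abstract.
CRITERIA = {
    "trace": np.trace,
    "det": np.linalg.det,
    "max_eig": lambda S: np.linalg.eigvalsh(S)[-1],  # eigvalsh sorts ascending
}

def forward_step(X, Y, selected, criterion="trace"):
    """Return the unselected column whose entry minimizes the criterion."""
    score = CRITERIA[criterion]
    n, p = X.shape
    best_j, best_val = None, np.inf
    for j in range(p):
        if j in selected:
            continue
        Xs = np.column_stack([np.ones(n), X[:, selected + [j]]])
        val = score(error_cov(Xs, Y))
        if val < best_val:
            best_j, best_val = j, val
    return best_j, best_val

# Simulated data (hypothetical): two responses driven by column 2 only.
rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.normal(size=(n, p))
Y = X[:, [2]] @ np.array([[3.0, -2.0]]) + 0.1 * rng.normal(size=(n, 2))

picks = {name: forward_step(X, Y, [], name)[0] for name in CRITERIA}
```

With one dependent variable the error covariance matrix is a scalar, so all three criteria reduce to the univariate SSE criterion up to the constant divisor.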

In this article, a new composite quantile regression (CQR) estimation approach is proposed for estimating the parametric part of the single-index model (SIM). We use local linear CQR to estimate the nonparametric part of the SIM when the error distribution is symmetric, and propose a weighted local linear CQR for the nonparametric part when the error distribution is asymmetric. Moreover, a new variable selection procedure is proposed for the SIM. Under some regularity conditions, we establish the large sample properties of the proposed estimators. Simulation studies and a real data analysis are presented to illustrate the behavior of the proposed estimators.

In this paper, we propose a new estimation method for binary quantile regression and variable selection which can be implemented by an iteratively reweighted least squares approach. In contrast to existing approaches, this method is computationally simple, is guaranteed to converge to a unique solution, and can be implemented with standard software packages. We demonstrate our method using Monte Carlo experiments and then apply it to the widely used work trip mode choice dataset. The results indicate that the proposed estimators work well in finite samples.

In this article, we consider variable selection for a class of semiparametric instrumental variable models. By combining an orthogonal weighting technique with the empirical likelihood method, we propose an orthogonal weighted empirical likelihood-based variable selection procedure. Under some mild conditions, the consistency and sparsity of the variable selection procedure are studied. Furthermore, some simulation studies and a real data analysis are carried out to examine the finite-sample performance of the proposed method.

This paper studies the outlier detection and robust variable selection problem in the linear regression model. The penalized weighted least absolute deviation (PWLAD) regression estimation method and the adaptive least absolute shrinkage and selection operator (LASSO) are combined to achieve outlier detection and robust variable selection simultaneously. An iterative algorithm is proposed to solve the resulting optimization problem. Monte Carlo studies evaluate the finite-sample performance of the proposed methods. The results indicate that the proposed methods perform better in finite samples than the existing methods when there are leverage points or outliers in the response variable or explanatory variables. Finally, we apply the proposed methodology to analyze two real datasets.

Jing Yang, Fang Lu, Hu Yang. Statistics, 2017, 51(6): 1179-1199
In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed estimation approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, and to lose almost no efficiency for normal errors. Asymptotic normality of the proposed estimators for both the parametric and nonparametric parts is established. To achieve sparsity when there exist irrelevant variables in the model, two variable selection procedures based on adaptive penalties are developed to select important parametric covariates as well as significant nonparametric functions. Moreover, both variable selection procedures are shown to enjoy the oracle property under some regularity conditions. Some Monte Carlo simulations are conducted to assess the finite sample performance of the proposed estimators, and a real-data example is used to illustrate the application of the proposed methods.

A fully nonparametric model may not perform well, and a researcher may want to use a parametric model even though the functional form with respect to a subset of the regressors, or the density of the errors, is not known. This becomes even more challenging when the data contain gross outliers or unusual observations. Moreover, in practice the true covariates are not known in advance, nor is the smoothness of the functional form. A robust model selection approach through which we can choose the relevant covariate components and estimate the smoothing function may offer an appealing solution. This paper considers weighted signed-rank estimation and variable selection under the adaptive lasso for semi-parametric partial additive models. B-splines are used to estimate the unknown additive nonparametric function. It is shown that, despite using B-splines to estimate the unknown additive nonparametric function, the proposed estimator has an oracle property. The robustness of the weighted signed-rank approach for data with heavy-tailed or contaminated errors, and for data containing high-leverage points, is validated via finite sample simulations. A practical application to an economic study is provided using updated Canadian household gasoline consumption data.

Varying-coefficient models have been widely used to investigate the possible time-dependent effects of covariates when the response variable comes from a normal distribution. Much progress has been made on inference and variable selection in the framework of such models. However, the identification of model structure, that is, how to identify which covariates have time-varying effects and which have fixed effects, remains a challenging and unsolved problem, especially when the dimension of the covariates is much larger than the sample size. In this article, we consider the structural identification and variable selection problems in varying-coefficient models for high-dimensional data. Using a modified basis expansion approach and group variable selection methods, we propose a unified procedure to simultaneously identify the model structure, select important variables, and estimate the coefficient curves. The unique feature of the proposed approach is that we do not have to specify the model structure in advance; it is therefore more realistic and appropriate for real data analysis. Asymptotic properties of the proposed estimators are derived under regularity conditions. Furthermore, we evaluate the finite sample performance of the proposed methods with Monte Carlo simulation studies and a real data analysis.

In this article, a new efficient iteration procedure based on quantile regression is developed for single-index varying-coefficient models. The proposed estimation scheme is an extension of the full iteration procedure proposed by Carroll et al., and differs from the method adopted by Wu et al. for single-index models, in which a double-weighted summation is used. This distinction is not only the reason that undersmoothing is a necessary condition in our proposed procedure, but may also reduce the computational burden, especially for large sample sizes. The resulting estimators are shown to be robust to outliers as well as to varying errors. Moreover, to achieve sparsity when there exist irrelevant variables in the index parameters, a variable selection procedure combined with the adaptive LASSO penalty is developed to simultaneously select and estimate significant parameters. Theoretical properties of the obtained estimators are established under some regularity conditions, and simulation studies with variously distributed errors are conducted to assess the finite sample performance of our proposed method.


Much attention has been paid to high-dimensional linear regression models, in which the number of observations is much smaller than the number of covariates. Because high dimensionality often induces collinearity, in this article we study penalized quantile regression with the elastic net (EnetQR), which combines the strengths of quadratic regularization and lasso shrinkage. We investigate the weak oracle property of the EnetQR under mild conditions in the high-dimensional setting. Moreover, we propose a two-step procedure, called adaptive elastic net quantile regression (AEnetQR), in which the weight vector in the second step is constructed from the EnetQR estimate in the first step. This two-step procedure is shown theoretically to possess the weak oracle property. The finite sample properties are examined through Monte Carlo simulation and a real-data analysis.
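To make the combination of quantile loss, lasso shrinkage, and quadratic regularization concrete, the following sketch evaluates an elastic-net-penalized quantile regression objective. The exact scaling of the loss and penalties used in the paper is not given in the abstract, so the averaging of the check loss and the parameter names `lam1` and `lam2` are assumptions, as are the function names.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def enet_qr_objective(beta, X, y, tau, lam1, lam2):
    """Average check loss plus an L1 (lasso) and a squared-L2 (ridge) penalty."""
    resid = y - X @ beta
    return (check_loss(resid, tau).mean()
            + lam1 * np.abs(beta).sum()
            + lam2 * np.square(beta).sum())

# Tiny hypothetical example at the median (tau = 0.5).
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
obj_zero = enet_qr_objective(np.zeros(2), X, y, tau=0.5, lam1=0.1, lam2=0.2)
obj_fit = enet_qr_objective(np.array([1.0, -1.0]), X, y, tau=0.5, lam1=0.1, lam2=0.2)
```

An adaptive second step of the kind described in the abstract would replace the single `lam1` by coefficient-specific weights built from a first-stage estimate; the precise weight construction is not stated in the abstract and is left out here.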

Penalization has been extensively adopted for variable selection in regression. In some applications, covariates have natural grouping structures, where those in the same group have correlated measurements or related functions. Under such settings, variable selection should be conducted at both the group-level and within-group-level, that is, a bi-level selection. In this study, we propose the adaptive sparse group Lasso (adSGL) method, which combines the adaptive Lasso and adaptive group Lasso (GL) to achieve bi-level selection. It can be viewed as an improved version of sparse group Lasso (SGL) and uses data-dependent weights to improve selection performance. For computation, a block coordinate descent algorithm is adopted. Simulation shows that adSGL has satisfactory performance in identifying both individual variables and groups and lower false discovery rate and mean square error than SGL and GL. We apply the proposed method to the analysis of a household healthcare expenditure data set.
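The bi-level penalty described above can be sketched as a weighted lasso term plus a weighted group lasso term. The abstract does not state the weight formulas, so the inverse-magnitude weights computed from an initial estimate `beta_init`, the small constant `eps`, and the tuning parameters `lam_ind` and `lam_grp` are all assumptions made for illustration, and `adsgl_penalty` is a hypothetical name.

```python
import numpy as np

def adsgl_penalty(beta, groups, beta_init, lam_ind, lam_grp, eps=1e-8):
    """Adaptive lasso term on individual coefficients plus adaptive group
    lasso term on group norms, with weights from an initial estimate."""
    beta = np.asarray(beta, dtype=float)
    beta_init = np.asarray(beta_init, dtype=float)

    # Within-group (individual) level: weight each coefficient by 1/|initial|.
    w = 1.0 / (np.abs(beta_init) + eps)
    penalty = lam_ind * np.sum(w * np.abs(beta))

    # Group level: weight each group by 1/||initial group coefficients||_2.
    for idx in groups:
        xi = 1.0 / (np.linalg.norm(beta_init[idx]) + eps)
        penalty += lam_grp * xi * np.linalg.norm(beta[idx])
    return penalty

# Hypothetical example: two groups, the second one entirely zero.
value = adsgl_penalty(beta=[3.0, 4.0, 0.0], groups=[[0, 1], [2]],
                      beta_init=[1.0, 1.0, 1.0],
                      lam_ind=1.0, lam_grp=1.0, eps=0.0)
```

Setting all weights to one recovers an unweighted sparse group lasso penalty, which is the sense in which the adaptive version can be read as a reweighted SGL.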


Spatial heterogeneity and spatial correlation are both accounted for in the geographically weighted spatial autoregressive model, which has recently attracted the attention of some scholars. Existing research on estimating the model assumes that the error terms are independent and identically distributed. In this article, we use a computationally simple procedure to estimate the model with spatially autoregressive disturbance terms; estimates of both the constant coefficients and the variable coefficients are obtained. We then give the large sample properties of the estimators under some ordinary conditions. Applications of the estimation methods will be explored in a separate study.

In this article, a robust variable selection procedure based on the weighted composite quantile regression (WCQR) is proposed. Compared with the composite quantile regression (CQR), WCQR is robust to heavy-tailed errors and outliers in the explanatory variables. For the choice of the weights in the WCQR, we employ a weighting scheme based on the principal component method. To select variables with grouping effect, we consider WCQR with SCAD-L2 penalization. Furthermore, under some suitable assumptions, the theoretical properties, including the consistency and oracle property of the estimator, are established with a diverging number of parameters. In addition, we study the numerical performance of the proposed method in the case of ultrahigh-dimensional data. Simulation studies and real examples are provided to demonstrate the superiority of our method over the CQR method when there are outliers in the explanatory variables and/or the random error is from a heavy-tailed distribution.

This paper considers variable and factor selection in factor analysis. We treat the factor loadings for each observable variable as a group, and introduce a weighted sparse group lasso penalty to the complete log-likelihood. The proposal simultaneously selects observable variables and latent factors of a factor analysis model in a data-driven fashion; it produces a more flexible and sparse factor loading structure than existing methods. For parameter estimation, we derive an expectation-maximization algorithm that optimizes the penalized log-likelihood. The tuning parameters of the procedure are selected by a likelihood cross-validation criterion that yields satisfactory results in various simulation settings. Simulation results reveal that the proposed method can better identify the possibly sparse structure of the true factor loading matrix with higher estimation accuracy than existing methods. A real data example is also presented to demonstrate its performance in practice.


In this article, we propose a new penalized-likelihood method to conduct model selection for finite mixture of regression models. The penalties are imposed on mixing proportions and regression coefficients, and hence order selection of the mixture and the variable selection in each component can be simultaneously conducted. The consistency of order selection and the consistency of variable selection are investigated. A modified EM algorithm is proposed to maximize the penalized log-likelihood function. Numerical simulations are conducted to demonstrate the finite sample performance of the estimation procedure. The proposed methodology is further illustrated via real data analysis.

For regression problems with grouped covariates, we adapt the idea of the sparse group lasso (SGL) (J. Friedman, T. Hastie, and R. Tibshirani, A note on the group lasso and a sparse group lasso, Tech. Rep., Statistics Department, Stanford University, 2010) to the framework of sufficient dimension reduction. Assuming that the regression has a single-index structure, we propose a method called sparse group sufficient dimension reduction to conduct group and within-group variable selection simultaneously without assuming a specific link function. Simulation studies show that our method is comparable to the SGL under the regular linear model setting and outperforms SGL, with higher true positive rates and substantially lower false positive rates, when the regression function is nonlinear. One immediate application of our method is gene pathway data analysis, where genes naturally fall into groups (pathways). An analysis of glioblastoma microarray data is included to illustrate our method.
