首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A number of nonstationary models have been developed to estimate extreme events as function of covariates. A quantile regression (QR) model is a statistical approach intended to estimate and conduct inference about the conditional quantile functions. In this article, we focus on the simultaneous variable selection and parameter estimation through penalized quantile regression. We conducted a comparison of regularized Quantile Regression model with B-Splines in Bayesian framework. Regularization is based on penalty and aims to favor parsimonious model, especially in the case of large dimension space. The prior distributions related to the penalties are detailed. Five penalties (Lasso, Ridge, SCAD0, SCAD1 and SCAD2) are considered with their equivalent expressions in Bayesian framework. The regularized quantile estimates are then compared to the maximum likelihood estimates with respect to the sample size. A Markov Chain Monte Carlo (MCMC) algorithms are developed for each hierarchical model to simulate the conditional posterior distribution of the quantiles. Results indicate that the SCAD0 and Lasso have the best performance for quantile estimation according to Relative Mean Biais (RMB) and the Relative Mean-Error (RME) criteria, especially in the case of heavy distributed errors. A case study of the annual maximum precipitation at Charlo, Eastern Canada, with the Pacific North Atlantic climate index as covariate is presented.  相似文献   

2.
The penalized maximum likelihood estimator (PMLE) has been widely used for variable selection in high-dimensional data. Various penalty functions have been employed for this purpose, e.g., Lasso, weighted Lasso, or smoothly clipped absolute deviations. However, the PMLE can be very sensitive to outliers in the data, especially to outliers in the covariates (leverage points). In order to overcome this disadvantage, the usage of the penalized maximum trimmed likelihood estimator (PMTLE) is proposed to estimate the unknown parameters in a robust way. The computation of the PMTLE takes advantage of the same technology as used for PMLE but here the estimation is based on subsamples only. The breakdown point properties of the PMTLE are discussed using the notion of $d$ -fullness. The performance of the proposed estimator is evaluated in a simulation study for the classical multiple linear and Poisson linear regression models.  相似文献   

3.
Abstract

In this article, we propose a new penalized-likelihood method to conduct model selection for finite mixture of regression models. The penalties are imposed on mixing proportions and regression coefficients, and hence order selection of the mixture and the variable selection in each component can be simultaneously conducted. The consistency of order selection and the consistency of variable selection are investigated. A modified EM algorithm is proposed to maximize the penalized log-likelihood function. Numerical simulations are conducted to demonstrate the finite sample performance of the estimation procedure. The proposed methodology is further illustrated via real data analysis.  相似文献   

4.
Abstract

In this article, we study the variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and empirically shown to be much more efficient and robust with respect to heavy tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the covariates missing, we consider the inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator by using estimated selection probability has a smaller asymptotic variance than that with true selection probability, thus is more efficient. Therefore, the important Horvitz-Thompson property is verified for penalized rank estimator with the covariates missing in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.  相似文献   

5.
Mixture of linear regression models provide a popular treatment for modeling nonlinear regression relationship. The traditional estimation of mixture of regression models is based on Gaussian error assumption. It is well known that such assumption is sensitive to outliers and extreme values. To overcome this issue, a new class of finite mixture of quantile regressions (FMQR) is proposed in this article. Compared with the existing Gaussian mixture regression models, the proposed FMQR model can provide a complete specification on the conditional distribution of response variable for each component. From the likelihood point of view, the FMQR model is equivalent to the finite mixture of regression models based on errors following asymmetric Laplace distribution (ALD), which can be regarded as an extension to the traditional mixture of regression models with normal error terms. An EM algorithm is proposed to obtain the parameter estimates of the FMQR model by combining a hierarchical representation of the ALD. Finally, the iterated weighted least square estimation for each mixture component of the FMQR model is derived. Simulation studies are conducted to illustrate the finite sample performance of the estimation procedure. Analysis of an aphid data set is used to illustrate our methodologies.  相似文献   

6.
Analysis of massive datasets is challenging owing to limitations of computer primary memory. Composite quantile regression (CQR) is a robust and efficient estimation method. In this paper, we extend CQR to massive datasets and propose a divide-and-conquer CQR method. The basic idea is to split the entire dataset into several blocks, applying the CQR method for data in each block, and finally combining these regression results via weighted average. The proposed approach significantly reduces the required amount of primary memory, and the resulting estimate will be as efficient as if the entire data set is analysed simultaneously. Moreover, to improve the efficiency of CQR, we propose a weighted CQR estimation approach. To achieve sparsity with high-dimensional covariates, we develop a variable selection procedure to select significant parametric components and prove the method possessing the oracle property. Both simulations and data analysis are conducted to illustrate the finite sample performance of the proposed methods.  相似文献   

7.
A fully nonparametric model may not perform well or when the researcher wants to use a parametric model but the functional form with respect to a subset of the regressors or the density of the errors is not known. This becomes even more challenging when the data contain gross outliers or unusual observations. However, in practice the true covariates are not known in advance, nor is the smoothness of the functional form. A robust model selection approach through which we can choose the relevant covariates components and estimate the smoothing function may represent an appealing tool to the solution. A weighted signed-rank estimation and variable selection under the adaptive lasso for semi-parametric partial additive models is considered in this paper. B-spline is used to estimate the unknown additive nonparametric function. It is shown that despite using B-spline to estimate the unknown additive nonparametric function, the proposed estimator has an oracle property. The robustness of the weighted signed-rank approach for data with heavy-tail, contaminated errors, and data containing high-leverage points are validated via finite sample simulations. A practical application to an economic study is provided using an updated Canadian household gasoline consumption data.  相似文献   

8.
This article is concerned with the estimation problem in the semiparametric isotonic regression model when the covariates are measured with additive errors and the response is missing at random. An inverse marginal probability weighted imputation approach is developed to estimate the regression parameters and a least-square approach under monotone constraint is employed to estimate the functional component. We show that the proposed estimator of the regression parameter is root-n consistent and asymptotically normal and the isotonic estimator of the functional component, at a fixed point, is cubic root-n consistent. A simulation study is conducted to examine the finite-sample properties of the proposed estimators. A data set is used to demonstrate the proposed approach.  相似文献   

9.
Finite mixture of regression (FMR) models are aimed at characterizing subpopulation heterogeneity stemming from different sets of covariates that impact different groups in a population. We address the contemporary problem of simultaneously conducting covariate selection and determining the number of mixture components from a Bayesian perspective that can incorporate prior information. We propose a Gibbs sampling algorithm with reversible jump Markov chain Monte Carlo implementation to accomplish concurrent covariate selection and mixture component determination in FMR models. Our Bayesian approach contains innovative features compared to previously developed reversible jump algorithms. In addition, we introduce component-adaptive weighted g priors for regression coefficients, and illustrate their improved performance in covariate selection. Numerical studies show that the Gibbs sampler with reversible jump implementation performs well, and that the proposed weighted priors can be superior to non-adaptive unweighted priors.  相似文献   

10.
This paper focuses on robust estimation and variable selection for partially linear models. We combine the weighted least absolute deviation (WLAD) regression with the adaptive least absolute shrinkage and selection operator (LASSO) to achieve simultaneous robust estimation and variable selection for partially linear models. Compared with the LAD-LASSO method, the WLAD-LASSO method will resist to the heavy-tailed errors and outliers in the parametric components. In addition, we estimate the unknown smooth function by a robust local linear regression. Under some regular conditions, the theoretical properties of the proposed estimators are established. We further examine finite-sample performance of the proposed procedure by simulation studies and a real data example.  相似文献   

11.
Quantile regression has gained increasing popularity as it provides richer information than the regular mean regression, and variable selection plays an important role in the quantile regression model building process, as it improves the prediction accuracy by choosing an appropriate subset of regression predictors. Unlike the traditional quantile regression, we consider the quantile as an unknown parameter and estimate it jointly with other regression coefficients. In particular, we adopt the Bayesian adaptive Lasso for the maximum entropy quantile regression. A flat prior is chosen for the quantile parameter due to the lack of information on it. The proposed method not only addresses the problem about which quantile would be the most probable one among all the candidates, but also reflects the inner relationship of the data through the estimated quantile. We develop an efficient Gibbs sampler algorithm and show that the performance of our proposed method is superior than the Bayesian adaptive Lasso and Bayesian Lasso through simulation studies and a real data analysis.  相似文献   

12.
Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models use a normal distribution for regression error. Such assumptions are unsuitable for a set of data containing a group or groups of observations with asymmetric behavior. In this paper, we introduce a variable selection procedure for FMR models using the skew-normal distribution. With appropriate choice of the tuning parameters, we establish the theoretical properties of our procedure, including consistency in variable selection and the oracle property in estimation. To estimate the parameters of the model, a modified EM algorithm for numerical computations is developed. The methodology is illustrated through numerical experiments and a real data example.  相似文献   

13.
This paper considers robust variable selection in semiparametric modeling for longitudinal data with an unspecified dependence structure. First, by basis spline approximation and using a general formulation to treat mean, median, quantile and robust mean regressions in one setting, we propose a weighted M-type regression estimator, which achieves robustness against outliers in both the response and covariates directions, and can accommodate heterogeneity, and the asymptotic properties are also established. Furthermore, a penalized weighted M-type estimator is proposed, which can do estimation and select relevant nonparametric and parametric components simultaneously, and robustly. Without any specification of error distribution and intra-subject dependence structure, the variable selection method works beautifully, including consistency in variable selection and oracle property in estimation. Simulation studies also confirm our method and theories.  相似文献   

14.
We consider a linear regression model where there are group structures in covariates. The group LASSO has been proposed for group variable selections. Many nonconvex penalties such as smoothly clipped absolute deviation and minimax concave penalty were extended to group variable selection problems. The group coordinate descent (GCD) algorithm is used popularly for fitting these models. However, the GCD algorithms are hard to be applied to nonconvex group penalties due to computational complexity unless the design matrix is orthogonal. In this paper, we propose an efficient optimization algorithm for nonconvex group penalties by combining the concave convex procedure and the group LASSO algorithm. We also extend the proposed algorithm for generalized linear models. We evaluate numerical efficiency of the proposed algorithm compared to existing GCD algorithms through simulated data and real data sets.  相似文献   

15.
Penalization has been extensively adopted for variable selection in regression. In some applications, covariates have natural grouping structures, where those in the same group have correlated measurements or related functions. Under such settings, variable selection should be conducted at both the group-level and within-group-level, that is, a bi-level selection. In this study, we propose the adaptive sparse group Lasso (adSGL) method, which combines the adaptive Lasso and adaptive group Lasso (GL) to achieve bi-level selection. It can be viewed as an improved version of sparse group Lasso (SGL) and uses data-dependent weights to improve selection performance. For computation, a block coordinate descent algorithm is adopted. Simulation shows that adSGL has satisfactory performance in identifying both individual variables and groups and lower false discovery rate and mean square error than SGL and GL. We apply the proposed method to the analysis of a household healthcare expenditure data set.  相似文献   

16.
Partially linear varying coefficient models (PLVCMs) with heteroscedasticity are considered in this article. Based on composite quantile regression, we develop a weighted composite quantile regression (WCQR) to estimate the non parametric varying coefficient functions and the parametric regression coefficients. The WCQR is augmented using a data-driven weighting scheme. Moreover, the asymptotic normality of proposed estimators for both the parametric and non parametric parts are studied explicitly. In addition, by comparing the asymptotic relative efficiency theoretically and numerically, WCQR method all outperforms the CQR method and some other estimate methods. To achieve sparsity with high-dimensional covariates, we develop a variable selection procedure to select significant parametric components for the PLVCM and prove the method possessing the oracle property. Both simulations and data analysis are conducted to illustrate the finite-sample performance of the proposed methods.  相似文献   

17.
A mixture of regression models for multivariate observed variables which contextually involves a dimension reduction step through a linear factor model is proposed. The model estimation is performed via the EM-algorithm and a procedure to compute asymptotic standard errors for the parameter estimates is developed. The proposed approach is applied to the study of students satisfaction towards different aspects of their school as a function of various covariates.  相似文献   

18.
Efficient statistical inference on nonignorable missing data is a challenging problem. This paper proposes a new estimation procedure based on composite quantile regression (CQR) for linear regression models with nonignorable missing data, that is applicable even with high-dimensional covariates. A parametric model is assumed for modelling response probability, which is estimated by the empirical likelihood approach. Local identifiability of the proposed strategy is guaranteed on the basis of an instrumental variable approach. A set of data-based adaptive weights constructed via an empirical likelihood method is used to weight CQR functions. The proposed method is resistant to heavy-tailed errors or outliers in the response. An adaptive penalisation method for variable selection is proposed to achieve sparsity with high-dimensional covariates. Limiting distributions of the proposed estimators are derived. Simulation studies are conducted to investigate the finite sample performance of the proposed methodologies. An application to the ACTG 175 data is analysed.  相似文献   

19.
In the paper we consider minimisation of U-statistics with the weighted Lasso penalty and investigate their asymptotic properties in model selection and estimation. We prove that the use of appropriate weights in the penalty leads to the procedure that behaves like the oracle that knows the true model in advance, i.e. it is model selection consistent and estimates nonzero parameters with the standard rate. For the unweighted Lasso penalty, we obtain sufficient and necessary conditions for model selection consistency of estimators. The obtained results strongly based on the convexity of the loss function that is the main assumption of the paper. Our theorems can be applied to the ranking problem as well as generalised regression models. Thus, using U-statistics we can study more complex models (better describing real problems) than usually investigated linear or generalised linear models.  相似文献   

20.
We consider the problem of variable selection for a class of varying coefficient models with instrumental variables. We focus on the case that some covariates are endogenous variables, and some auxiliary instrumental variables are available. An instrumental variable based variable selection procedure is proposed by using modified smooth-threshold estimating equations (SEEs). The proposed procedure can automatically eliminate the irrelevant covariates by setting the corresponding coefficient functions as zero, and simultaneously estimate the nonzero regression coefficients by solving the smooth-threshold estimating equations. The proposed variable selection procedure avoids the convex optimization problem, and is flexible and easy to implement. Simulation studies are carried out to assess the performance of the proposed variable selection method.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号