首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The authors propose a robust transformation linear mixed‐effects model for longitudinal continuous proportional data when some of the subjects exhibit outlying trajectories over time. It becomes troublesome when including or excluding such subjects in the data analysis results in different statistical conclusions. To robustify the longitudinal analysis using the mixed‐effects model, they utilize the multivariate t distribution for random effects or/and error terms. Estimation and inference in the proposed model are established and illustrated by a real data example from an ophthalmology study. Simulation studies show a substantial robustness gain by the proposed model in comparison to the mixed‐effects model based on Aitchison's logit‐normal approach. As a result, the data analysis benefits from the robustness of making consistent conclusions in the presence of influential outliers. The Canadian Journal of Statistics © 2009 Statistical Society of Canada  相似文献   

2.
Bayesian selection of variables is often difficult to carry out because of the challenge in specifying prior distributions for the regression parameters for all possible models, specifying a prior distribution on the model space and computations. We address these three issues for the logistic regression model. For the first, we propose an informative prior distribution for variable selection. Several theoretical and computational properties of the prior are derived and illustrated with several examples. For the second, we propose a method for specifying an informative prior on the model space, and for the third we propose novel methods for computing the marginal distribution of the data. The new computational algorithms only require Gibbs samples from the full model to facilitate the computation of the prior and posterior model probabilities for all possible models. Several properties of the algorithms are also derived. The prior specification for the first challenge focuses on the observables in that the elicitation is based on a prior prediction y 0 for the response vector and a quantity a 0 quantifying the uncertainty in y 0. Then, y 0 and a 0 are used to specify a prior for the regression coefficients semi-automatically. Examples using real data are given to demonstrate the methodology.  相似文献   

3.
Variable selection is an important issue in all regression analysis, and in this article, we investigate the simultaneous variable selection in joint location and scale models of the skew-t-normal distribution when the dataset under consideration involves heavy tail and asymmetric outcomes. We propose a unified penalized likelihood method which can simultaneously select significant variables in the location and scale models. Furthermore, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the location and scale models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. These estimators are compared by simulation studies.  相似文献   

4.
A regression model with skew-normal errors provides a useful extension for ordinary normal regression models when the data set under consideration involves asymmetric outcomes. Variable selection is an important issue in all regression analyses, and in this paper, we investigate the simultaneously variable selection in joint location and scale models of the skew-normal distribution. We propose a unified penalized likelihood method which can simultaneously select significant variables in the location and scale models. Furthermore, the proposed variable selection method can simultaneously perform parameter estimation and variable selection in the location and scale models. With appropriate selection of the tuning parameters, we establish the consistency and the oracle property of the regularized estimators. Simulation studies and a real example are used to illustrate the proposed methodologies.  相似文献   

5.
The results of analyzing experimental data using a parametric model may heavily depend on the chosen model for regression and variance functions, moreover also on a possibly underlying preliminary transformation of the variables. In this paper we propose and discuss a complex procedure which consists in a simultaneous selection of parametric regression and variance models from a relatively rich model class and of Box-Cox variable transformations by minimization of a cross-validation criterion. For this it is essential to introduce modifications of the standard cross-validation criterion adapted to each of the following objectives: 1. estimation of the unknown regression function, 2. prediction of future values of the response variable, 3. calibration or 4. estimation of some parameter with a certain meaning in the corresponding field of application. Our idea of a criterion oriented combination of procedures (which usually if applied, then in an independent or sequential way) is expected to lead to more accurate results. We show how the accuracy of the parameter estimators can be assessed by a “moment oriented bootstrap procedure", which is an essential modification of the “wild bootstrap” of Härdle and Mammen by use of more accurate variance estimates. This new procedure and its refinement by a bootstrap based pivot (“double bootstrap”) is also used for the construction of confidence, prediction and calibration intervals. Programs written in Splus which realize our strategy for nonlinear regression modelling and parameter estimation are described as well. The performance of the selected model is discussed, and the behaviour of the procedures is illustrated, e.g., by an application in radioimmunological assay.  相似文献   

6.
The authors look into the problem of estimating regression functions that exhibit jump irregularities in the first derivative. They investigate the behaviour of the bias in the local linear fit and show the superior performance of appropriate one‐sided versions of the local linear fit near such irregularities. They then propose an improved estimation procedure based on data‐driven selection of a conventional or one‐sided local linear fit according to a residual sum of squares type of criterion. The authors provide theoretical results and illustrate the method both on simulated and real‐life data examples. The Canadian Journal of Statistics 37: 453–475; 2009 © 2009 Statistical Society of Canada  相似文献   

7.
We study variable selection in quantile regression with multiple responses. Instead of applying conventional penalized quantile regression to each response separately, it is desired to solve them simultaneously when the sparsity patterns of the regression coefficients for different responses are similar, which is often the case in practice. In this paper, we propose employing a hierarchical penalty that enables us to detect a common sparsity pattern shared between different responses as well as additional sparsity patterns within the selected variables. We establish the oracle property of the proposed method and demonstrate it offers better performance than existing approaches.  相似文献   

8.
Regularization methods for simultaneous variable selection and coefficient estimation have been shown to be effective in quantile regression in improving the prediction accuracy. In this article, we propose the Bayesian bridge for variable selection and coefficient estimation in quantile regression. A simple and efficient Gibbs sampling algorithm was developed for posterior inference using a scale mixture of uniform representation of the Bayesian bridge prior. This is the first work to discuss regularized quantile regression with the bridge penalty. Both simulated and real data examples show that the proposed method often outperforms quantile regression without regularization, lasso quantile regression, and Bayesian lasso quantile regression.  相似文献   

9.
Although regression estimates are quite robust to slight departure from normality, symmetric prediction intervals assuming normality can be highly unsatisfactory and problematic if the residuals have a skewed distribution. For data with distributions outside the class covered by the Generalized Linear Model, a common way to handle non-normality is to transform the response variable. Unfortunately, transforming the response variable often destroys the theoretical or empirical functional relationship connecting the mean of the response variable to the explanatory variables established on the original scale. Further complication arises if a single transformation cannot both stabilize variance and attain normality. Furthermore, practitioners also find the interpretation of highly transformed data not obvious and often prefer an analysis on the original scale. The present paper presents an alternative approach for handling simultaneously heteroscedasticity and non-normality without resorting to data transformation. Unlike classical approaches, the proposed modeling allows practitioners to formulate the mean and variance relationships directly on the original scale, making data interpretation considerably easier. The modeled variance relationship and form of non-normality in the proposed approach can be easily examined through a certain function of the standardized residuals. The proposed method is seen to remain consistent for estimating the regression parameters even if the variance function is misspecified. The method along with some model checking techniques is illustrated with a real example.  相似文献   

10.
Tree‐based methods are frequently used in studies with censored survival time. Their structure and ease of interpretability make them useful to identify prognostic factors and to predict conditional survival probabilities given an individual's covariates. The existing methods are tailor‐made to deal with a survival time variable that is measured continuously. However, survival variables measured on a discrete scale are often encountered in practice. The authors propose a new tree construction method specifically adapted to such discrete‐time survival variables. The splitting procedure can be seen as an extension, to the case of right‐censored data, of the entropy criterion for a categorical outcome. The selection of the final tree is made through a pruning algorithm combined with a bootstrap correction. The authors also present a simple way of potentially improving the predictive performance of a single tree through bagging. A simulation study shows that single trees and bagged‐trees perform well compared to a parametric model. A real data example investigating the usefulness of personality dimensions in predicting early onset of cigarette smoking is presented. The Canadian Journal of Statistics 37: 17‐32; 2009 © 2009 Statistical Society of Canada  相似文献   

11.
We consider variable selection in linear regression of geostatistical data that arise often in environmental and ecological studies. A penalized least squares procedure is studied for simultaneous variable selection and parameter estimation. Various penalty functions are considered including smoothly clipped absolute deviation. Asymptotic properties of penalized least squares estimates, particularly the oracle properties, are established, under suitable regularity conditions imposed on a random field model for the error process. Moreover, computationally feasible algorithms are proposed for estimating regression coefficients and their standard errors. Finite‐sample properties of the proposed methods are investigated in a simulation study and comparison is made among different penalty functions. The methods are illustrated by an ecological dataset of landcover in Wisconsin. The Canadian Journal of Statistics 37: 607–624; 2009 © 2009 Statistical Society of Canada  相似文献   

12.
One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least‐squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank‐based procedures have desirable robustness properties compared to LS procedures, we propose a rank‐based adaptive lasso‐type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta‐Carotene Level data set.  相似文献   

13.
The penalized logistic regression (PLR) is a powerful statistical tool for classification. It has been commonly used in many practical problems. Despite its success, since the loss function of the PLR is unbounded, resulting classifiers can be sensitive to outliers. To build more robust classifiers, we propose the robust PLR (RPLR) which uses truncated logistic loss functions, and suggest three schemes to estimate conditional class probabilities. Connections of the RPLR with some other existing work on robust logistic regression have been discussed. Our theoretical results indicate that the RPLR is Fisher consistent and more robust to outliers. Moreover, we develop estimated generalized approximate cross validation (EGACV) for the tuning parameter selection. Through numerical examples, we demonstrate that truncating the loss function indeed yields better performance in terms of classification accuracy and class probability estimation.  相似文献   

14.
In this article, we consider the problem of selecting functional variables using the L1 regularization in a functional linear regression model with a scalar response and functional predictors, in the presence of outliers. Since the LASSO is a special case of the penalized least-square regression with L1 penalty function, it suffers from the heavy-tailed errors and/or outliers in data. Recently, Least Absolute Deviation (LAD) and the LASSO methods have been combined (the LAD-LASSO regression method) to carry out robust parameter estimation and variable selection simultaneously for a multiple linear regression model. However, variable selection of the functional predictors based on LASSO fails since multiple parameters exist for a functional predictor. Therefore, group LASSO is used for selecting functional predictors since group LASSO selects grouped variables rather than individual variables. In this study, we propose a robust functional predictor selection method, the LAD-group LASSO, for a functional linear regression model with a scalar response and functional predictors. We illustrate the performance of the LAD-group LASSO on both simulated and real data.  相似文献   

15.
A challenging problem in the analysis of high-dimensional data is variable selection. In this study, we describe a bootstrap based technique for selecting predictors in partial least-squares regression (PLSR) and principle component regression (PCR) in high-dimensional data. Using a bootstrap-based technique for significance tests of the regression coefficients, a subset of the original variables can be selected to be included in the regression, thus obtaining a more parsimonious model with smaller prediction errors. We compare the bootstrap approach with several variable selection approaches (jack-knife and sparse formulation-based methods) on PCR and PLSR in simulation and real data.  相似文献   

16.
In this article, we develop a robust variable selection procedure jointly for fixed and random effects in linear mixed models for longitudinal data. We propose a penalized robust estimator for both the regression coefficients and the variance of random effects based on a re-parametrization of the linear mixed models. Under some regularity conditions, we show the oracle properties of the proposed robust variable selection method. Simulation study shows the robustness of the proposed method against outliers. In the end, the proposed methods is illustrated in the analysis of a real data set.  相似文献   

17.
Abstract

In this article, we propose a new penalized-likelihood method to conduct model selection for finite mixture of regression models. The penalties are imposed on mixing proportions and regression coefficients, and hence order selection of the mixture and the variable selection in each component can be simultaneously conducted. The consistency of order selection and the consistency of variable selection are investigated. A modified EM algorithm is proposed to maximize the penalized log-likelihood function. Numerical simulations are conducted to demonstrate the finite sample performance of the estimation procedure. The proposed methodology is further illustrated via real data analysis.  相似文献   

18.
Conventional methods apply symmetric prior distributions such as a normal distribution or a Laplace distribution for regression coefficients, which may be suitable for median regression and exhibit no robustness to outliers. This work develops a quantile regression on linear panel data model without heterogeneity from a Bayesian point of view, i.e. upon a location-scale mixture representation of the asymmetric Laplace error distribution, and provides how the posterior distribution is summarized using Markov chain Monte Carlo methods. Applying this approach to the 1970 British Cohort Study (BCS) data, it finds that a different maternal health problem has different influence on child's worrying status at different quantiles. In addition, applying stochastic search variable selection for maternal health problems to the 1970 BCS data, it finds that maternal nervous breakdown, among the 25 maternal health problems, contributes most to influence the child's worrying status.  相似文献   

19.
When confronted with multiple covariates and a response variable, analysts sometimes apply a variable‐selection algorithm to the covariate‐response data to identify a subset of covariates potentially associated with the response, and then wish to make inferences about parameters in a model for the marginal association between the selected covariates and the response. If an independent data set were available, the parameters of interest could be estimated by using standard inference methods to fit the postulated marginal model to the independent data set. However, when applied to the same data set used by the variable selector, standard (“naive”) methods can lead to distorted inferences. The authors develop testing and interval estimation methods for parameters reflecting the marginal association between the selected covariates and response variable, based on the same data set used for variable selection. They provide theoretical justification for the proposed methods, present results to guide their implementation, and use simulations to assess and compare their performance to a sample‐splitting approach. The methods are illustrated with data from a recent AIDS study. The Canadian Journal of Statistics 37: 625–644; 2009 © 2009 Statistical Society of Canada  相似文献   

20.
To enhance modeling flexibility, the authors propose a nonparametric hazard regression model, for which the ordinary and weighted least squares estimation and inference procedures are studied. The proposed model does not assume any parametric specifications on the covariate effects, which is suitable for exploring the nonlinear interactions between covariates, time and some exposure variable. The authors propose the local ordinary and weighted least squares estimators for the varying‐coefficient functions and establish the corresponding asymptotic normality properties. Simulation studies are conducted to empirically examine the finite‐sample performance of the new methods, and a real data example from a recent breast cancer study is used as an illustration. The Canadian Journal of Statistics 37: 659–674; 2009 © 2009 Statistical Society of Canada  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号