首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Random coefficient model (RCM) is a powerful statistical tool in analyzing correlated data collected from studies with different clusters or from longitudinal studies. In practice, there is a need for statistical methods that allow biomedical researchers to adjust for the measured and unmeasured covariates that might affect the regression model. This article studies two nonparametric methods dealing with auxiliary covariate data in linear random coefficient models. We demonstrate how to estimate the coefficients of the models and how to predict the random effects when the covariates are missing or mismeasured. We employ empirical estimator and kernel smoother to handle a discrete and continuous auxiliary, respectively. Simulation results show that the proposed methods perform better than an alternative method that only uses data in the validation data set and ignores the random effects in the random coefficient model.  相似文献   

2.
Abstract

In this article, we consider a panel data partially linear regression model with fixed effect and non parametric time trend function. The data can be dependent cross individuals through linear regressor and error components. Unlike the methods using non parametric smoothing technique, a difference-based method is proposed to estimate linear regression coefficients of the model to avoid bandwidth selection. Here the difference technique is employed to eliminate the non parametric function effect, not the fixed effects, on linear regressor coefficient estimation totally. Therefore, a more efficient estimator for parametric part is anticipated, which is shown to be true by the simulation results. For the non parametric component, the polynomial spline technique is implemented. The asymptotic properties of estimators for parametric and non parametric parts are presented. We also show how to select informative ones from a number of covariates in the linear part by using smoothly clipped absolute deviation-penalized estimators on a difference-based least-squares objective function, and the resulting estimators perform asymptotically as well as the oracle procedure in terms of selecting the correct model.  相似文献   

3.
Abstract

In this article, we study the variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and empirically shown to be much more efficient and robust with respect to heavy tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To deal with the covariates missing, we consider the inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator by using estimated selection probability has a smaller asymptotic variance than that with true selection probability, thus is more efficient. Therefore, the important Horvitz-Thompson property is verified for penalized rank estimator with the covariates missing in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.  相似文献   

4.
Partially linear varying coefficient models (PLVCMs) with heteroscedasticity are considered in this article. Based on composite quantile regression, we develop a weighted composite quantile regression (WCQR) to estimate the non parametric varying coefficient functions and the parametric regression coefficients. The WCQR is augmented using a data-driven weighting scheme. Moreover, the asymptotic normality of proposed estimators for both the parametric and non parametric parts are studied explicitly. In addition, by comparing the asymptotic relative efficiency theoretically and numerically, WCQR method all outperforms the CQR method and some other estimate methods. To achieve sparsity with high-dimensional covariates, we develop a variable selection procedure to select significant parametric components for the PLVCM and prove the method possessing the oracle property. Both simulations and data analysis are conducted to illustrate the finite-sample performance of the proposed methods.  相似文献   

5.
Variable selection over a potentially large set of covariates in a linear model is quite popular. In the Bayesian context, common prior choices can lead to a posterior expectation of the regression coefficients that is a sparse (or nearly sparse) vector with a few nonzero components, those covariates that are most important. This article extends the “global‐local” shrinkage idea to a scenario where one wishes to model multiple response variables simultaneously. Here, we have developed a variable selection method for a K‐outcome model (multivariate regression) that identifies the most important covariates across all outcomes. The prior for all regression coefficients is a mean zero normal with coefficient‐specific variance term that consists of a predictor‐specific factor (shared local shrinkage parameter) and a model‐specific factor (global shrinkage term) that differs in each model. The performance of our modeling approach is evaluated through simulation studies and a data example.  相似文献   

6.
Based on the inverse probability weight method, we, in this article, construct the empirical likelihood (EL) and penalized empirical likelihood (PEL) ratios of the parameter in the linear quantile regression model when the covariates are missing at random, in the presence and absence of auxiliary information, respectively. It is proved that the EL ratio admits a limiting Chi-square distribution. At the same time, the asymptotic normality of the maximum EL and PEL estimators of the parameter is established. Also, the variable selection of the model in the presence and absence of auxiliary information, respectively, is discussed. Simulation study and a real data analysis are done to evaluate the performance of the proposed methods.  相似文献   

7.
In this paper, a generalized partially linear model (GPLM) with missing covariates is studied and a Monte Carlo EM (MCEM) algorithm with penalized-spline (P-spline) technique is developed to estimate the regression coefficients and nonparametric function, respectively. As classical model selection procedures such as Akaike's information criterion become invalid for our considered models with incomplete data, some new model selection criterions for GPLMs with missing covariates are proposed under two different missingness mechanism, say, missing at random (MAR) and missing not at random (MNAR). The most attractive point of our method is that it is rather general and can be extended to various situations with missing observations based on EM algorithm, especially when no missing data involved, our new model selection criterions are reduced to classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also can compare missing data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criterions are investigated. A simulation study and a real example are used to illustrate the proposed methodology.  相似文献   

8.
Linear regression models are useful statistical tools to analyze data sets in different fields. There are several methods to estimate the parameters of a linear regression model. These methods usually perform under normally distributed and uncorrelated errors. If error terms are correlated the Conditional Maximum Likelihood (CML) estimation method under normality assumption is often used to estimate the parameters of interest. The CML estimation method is required a distributional assumption on error terms. However, in practice, such distributional assumptions on error terms may not be plausible. In this paper, we propose to estimate the parameters of a linear regression model with autoregressive error term using Empirical Likelihood (EL) method, which is a distribution free estimation method. A small simulation study is provided to evaluate the performance of the proposed estimation method over the CML method. The results of the simulation study show that the proposed estimators based on EL method are remarkably better than the estimators obtained from CML method in terms of mean squared errors (MSE) and bias in almost all the simulation configurations. These findings are also confirmed by the results of the numerical and real data examples.  相似文献   

9.
In survival analysis, time-dependent covariates are usually present as longitudinal data collected periodically and measured with error. The longitudinal data can be assumed to follow a linear mixed effect model and Cox regression models may be used for modelling of survival events. The hazard rate of survival times depends on the underlying time-dependent covariate measured with error, which may be described by random effects. Most existing methods proposed for such models assume a parametric distribution assumption on the random effects and specify a normally distributed error term for the linear mixed effect model. These assumptions may not be always valid in practice. In this article, we propose a new likelihood method for Cox regression models with error-contaminated time-dependent covariates. The proposed method does not require any parametric distribution assumption on random effects and random errors. Asymptotic properties for parameter estimators are provided. Simulation results show that under certain situations the proposed methods are more efficient than the existing methods.  相似文献   

10.
Varying-coefficient models have been widely used to investigate the possible time-dependent effects of covariates when the response variable comes from normal distribution. Much progress has been made for inference and variable selection in the framework of such models. However, the identification of model structure, that is how to identify which covariates have time-varying effects and which have fixed effects, remains a challenging and unsolved problem especially when the dimension of covariates is much larger than the sample size. In this article, we consider the structural identification and variable selection problems in varying-coefficient models for high-dimensional data. Using a modified basis expansion approach and group variable selection methods, we propose a unified procedure to simultaneously identify the model structure, select important variables and estimate the coefficient curves. The unique feature of the proposed approach is that we do not have to specify the model structure in advance, therefore, it is more realistic and appropriate for real data analysis. Asymptotic properties of the proposed estimators have been derived under regular conditions. Furthermore, we evaluate the finite sample performance of the proposed methods with Monte Carlo simulation studies and a real data analysis.  相似文献   

11.
Although heterogeneity across individuals may be reduced when a two-state process is extended into a multi-state process, the discrepancy between the observed and the predicted for some states may still exist owing to two possibilities, unobserved mixture distribution in the initial state and the effect of measured covariates on subsequent multi-state disease progression. In the present study, we developed a mixture Markov exponential regression model to take account of the above-mentioned heterogeneity across individuals (subject-to-subject variability) with a systematic model selection based on the likelihood ratio test. The model was successfully demonstrated by an empirical example on surveillance of patients with small hepatocellular carcinoma treated by non-surgical methods. The estimated results suggested that the model with the incorporation of unobserved mixture distribution behaves better than the one without. Complete and partial effects regarding risk factors on different subsequent multi-state transitions were identified using a homogeneous Markov model. The combination of both initial mixture distribution and homogeneous Markov exponential regression model makes a significant contribution to reducing heterogeneity across individuals and over time for disease progression.  相似文献   

12.
In this paper, a new estimation procedure based on composite quantile regression and functional principal component analysis (PCA) method is proposed for the partially functional linear regression models (PFLRMs). The proposed estimation method can simultaneously estimate both the parametric regression coefficients and functional coefficient components without specification of the error distributions. The proposed estimation method is shown to be more efficient empirically for non-normal random error, especially for Cauchy error, and almost as efficient for normal random errors. Furthermore, based on the proposed estimation procedure, we use the penalized composite quantile regression method to study variable selection for parametric part in the PFLRMs. Under certain regularity conditions, consistency, asymptotic normality, and Oracle property of the resulting estimators are derived. Simulation studies and a real data analysis are conducted to assess the finite sample performance of the proposed methods.  相似文献   

13.
This paper focuses on the variable selection for semiparametric varying coefficient partially linear model when the covariates are measured with additive errors and the response is missing. An adaptive lasso estimator and the smoothly clipped absolute deviation estimator as a comparison for the parameters are proposed. With the proper selection of regularization parameter, the sampling properties including the consistency of the two procedures and the oracle properties are established. Furthermore, the algorithms and corresponding standard error formulas are discussed. A simulation study is carried out to assess the finite sample performance of the proposed methods.  相似文献   

14.
There are several procedures for fitting generalized additive models, i.e. regression models for an exponential family response where the influence of each single covariates is assumed to have unknown, potentially non-linear shape. Simulated data are used to compare a smoothing parameter optimization approach for selection of smoothness and of covariates, a stepwise approach, a mixed model approach, and a procedure based on boosting techniques. In particular it is investigated how the performance of procedures is linked to amount of information, type of response, total number of covariates, number of influential covariates, and extent of non-linearity. Measures for comparison are prediction performance, identification of influential covariates, and smoothness of fitted functions. One result is that the mixed model approach returns sparse fits with frequently over-smoothed functions, while the functions are less smooth for the boosting approach and variable selection is less strict. The other approaches are in between with respect to these measures. The boosting procedure is seen to perform very well when little information is available and/or when a large number of covariates is to be investigated. It is somewhat surprising that in scenarios with low information the fitting of a linear model, even with stepwise variable selection, has not much advantage over the fitting of an additive model when the true underlying structure is linear. In cases with more information the prediction performance of all procedures is very similar. So, in difficult data situations the boosting approach can be recommended, in others the procedures can be chosen conditional on the aim of the analysis.  相似文献   

15.
For defining a Modified Maximum Likelihood Estimate of the scale parameter of Rayleigh distribution, a hyperbolic approximation is used instead of linear approximation for a function which appears in the Maximum Likelihood equation. This estimate is shown to perform better, in the sense of accuracy and simplicity of calculation, than the one based on linear approximation for the same function. Also the estimate of the scale parameter obtained is shown to be asymptotically unbiased. Numerical computation for random samples of different sizes from Rayleigh distribution, using type I1 censoring is done and is shown to be better than that obtained by Lee et al. (1980)  相似文献   

16.
We consider fitting Emax models to the primary endpoint for a parallel group dose–response clinical trial. Such models can be difficult to fit using Maximum Likelihood if the data give little information about the maximum possible response. Consequently, we consider alternative models that can be derived as limiting cases, which can usually be fitted. Furthermore we propose two model selection procedures for choosing between the different models. These model selection procedures are compared with two model selection procedures which have previously been used. In a simulation study we find that the model selection procedure that performs best depends on the underlying true situation. One of the new model selection procedures gives what may be regarded as the most robust of the procedures.  相似文献   

17.
Partial linear varying coefficient models (PLVCM) are often considered for analysing longitudinal data for a good balance between flexibility and parsimony. The existing estimation and variable selection methods for this model are mainly built upon which subset of variables have linear or varying effect on the response is known in advance, or say, model structure is determined. However, in application, this is unreasonable. In this work, we propose a simultaneous structure estimation and variable selection method, which can do simultaneous coefficient estimation and three types of selections: varying and constant effects selection, relevant variable selection. It can be easily implemented in one step by employing a penalized M-type regression, which uses a general loss function to treat mean, median, quantile and robust mean regressions in a unified framework. Consistency in the three types of selections and oracle property in estimation are established as well. Simulation studies and real data analysis also confirm our method.  相似文献   

18.
Mixture of linear mixed-effects models has received considerable attention in longitudinal studies, including medical research, social science and economics. The inferential question of interest is often the identification of critical factors that affect the responses. We consider a Bayesian approach to select the important fixed and random effects in the finite mixture of linear mixed-effects models. To accomplish our goal, latent variables are introduced to facilitate the identification of influential fixed and random components and to classify the membership of observations in the longitudinal data. A spike-and-slab prior for the regression coefficients is adopted to sidestep the potential complications of highly collinear covariates and to handle large p and small n issues in the variable selection problems. Here we employ Markov chain Monte Carlo (MCMC) sampling techniques for posterior inferences and explore the performance of the proposed method in simulation studies, followed by an actual psychiatric data analysis concerning depressive disorder.  相似文献   

19.
In this article, we develop a robust variable selection procedure jointly for fixed and random effects in linear mixed models for longitudinal data. We propose a penalized robust estimator for both the regression coefficients and the variance of random effects based on a re-parametrization of the linear mixed models. Under some regularity conditions, we show the oracle properties of the proposed robust variable selection method. Simulation study shows the robustness of the proposed method against outliers. In the end, the proposed methods is illustrated in the analysis of a real data set.  相似文献   

20.
Ordinal regression is used for modelling an ordinal response variable as a function of some explanatory variables. The classical technique for estimating the unknown parameters of this model is Maximum Likelihood (ML). The lack of robustness of this estimator is formally shown by deriving its breakdown point and its influence function. To robustify the procedure, a weighting step is added to the Maximum Likelihood estimator, yielding an estimator with bounded influence function. We also show that the loss in efficiency due to the weighting step remains limited. A diagnostic plot based on the Weighted Maximum Likelihood estimator allows to detect outliers of different types in a single plot.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号