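期刊界 All Journals: search all the world's journals and disseminate scholarly results. Professional journal search, journal information, academic search.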

1.

Least squares estimation of regression parameters in mixed effects models with unmeasured covariates

Jun Shao Mari Palta Roger P. Qu 《Communications in Statistics - Theory and Methods》2013,42(6):1487-1501

We consider mixed effects models for longitudinal, repeated measures or clustered data. Unmeasured or omitted covariates in such models may be correlated with the included covariates, and create model violations when not taken into account. Previous research and experience with longitudinal data sets suggest a general form of model which should be considered when omitted covariates are likely, such as in observational studies. We derive the marginal model between the response variable and included covariates, and consider model fitting using the ordinary and weighted least squares methods, which require simple non-iterative computation and no assumptions on the distribution of random covariates or error terms. Asymptotic properties of the least squares estimators are also discussed. The results shed light on the structure of least squares estimators in mixed effects models, and provide large sample procedures for statistical inference and prediction based on the marginal model. We present an example of the relationship between fluid intake and output in very low birth weight infants, where the model is found to have the assumed structure.
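As a rough illustration of the non-iterative least squares fitting mentioned in this abstract, the sketch below computes closed-form weighted least squares estimates for a single covariate. The function name `weighted_least_squares`, the toy data, and the weights are assumptions made for illustration and are not taken from the paper; setting all weights to 1 gives ordinary least squares.

```python
def weighted_least_squares(x, y, w):
    """Closed-form WLS fit of y = a + b*x with positive weights w."""
    sw = sum(w)
    # Weighted means of the covariate and the response.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted cross-product and sum of squares around the weighted means.
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


# Hypothetical toy data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
ols = weighted_least_squares(x, y, [1.0] * 4)             # equal weights: OLS
wls = weighted_least_squares(x, y, [1.0, 2.0, 2.0, 1.0])  # unequal weights: WLS
```

Because the toy data are noise-free, both fits recover the same line; with real clustered data the weights would typically reflect the within-cluster covariance structure.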

2.

Evaluation of Bayesian multiple stage estimation under spatial CAR model variants

Daniel R. Baer Andrew B. Lawson 《Journal of Statistical Computation and Simulation》2019,89(1):98-144

In this study, an evaluation of Bayesian hierarchical models is made based on simulation scenarios to compare single-stage and multi-stage Bayesian estimations. Simulated datasets of lung cancer disease counts for men aged 65 and older across 44 wards in the London Health Authority were analysed using a range of spatially structured random effect components. The goals of this study are to determine which of these single-stage models performs best given a certain simulating model, how estimation methods (single- vs. multi-stage) compare in yielding posterior estimates of fixed effects in the presence of spatially structured random effects, and finally which of two spatial prior models, the Leroux or the ICAR model, performs best in a multi-stage context under different assumptions concerning spatial correlation. Among the fitted single-stage models without covariates, we found that when there is a low amount of variability in the distribution of disease counts, the BYM model is relatively robust to misspecification in terms of DIC, while the Leroux model is the least robust to misspecification. When these models were fit to data generated from models with covariates, we found that when there was one set of covariates, either spatially correlated or non-spatially correlated, changing the values of the fixed coefficients affected the ability of either the Leroux or ICAR model to fit the data well in terms of DIC. When there were multiple sets of spatially correlated covariates in the simulating model, however, we could not distinguish the goodness of fit to the data between these single-stage models. We found that the multi-stage modelling process via the Leroux and ICAR models generally reduced the variance of the posterior estimated fixed effects for data generated from models with covariates and a UH term compared to analogous single-stage models. Finally, we found the multi-stage Leroux model compares favourably to the multi-stage ICAR model in terms of DIC. We conclude that the multi-stage Leroux model should be seriously considered in applications of Bayesian disease mapping when an investigator desires to fit a model with both fixed effects and spatially structured random effects to Poisson count data.
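The DIC comparisons described in this abstract rest on the standard definition DIC = mean posterior deviance + pD, where pD is the mean deviance minus the deviance at the posterior mean. The sketch below applies that definition to a deliberately simple single-rate Poisson model; it is not the BYM, Leroux or ICAR model, and the counts and posterior draws are hypothetical values chosen for illustration.

```python
import math


def poisson_deviance(counts, rate):
    """Deviance D = -2 * log-likelihood of i.i.d. Poisson counts."""
    loglik = sum(c * math.log(rate) - rate - math.lgamma(c + 1) for c in counts)
    return -2.0 * loglik


def dic(counts, rate_draws):
    """Return (DIC, pD) computed from posterior draws of the Poisson rate."""
    mean_dev = sum(poisson_deviance(counts, r) for r in rate_draws) / len(rate_draws)
    post_mean = sum(rate_draws) / len(rate_draws)
    # Effective number of parameters: mean deviance minus deviance at the mean.
    p_d = mean_dev - poisson_deviance(counts, post_mean)
    return mean_dev + p_d, p_d


counts = [3, 5, 4, 6, 2]                 # hypothetical disease counts
rate_draws = [3.6, 4.1, 3.9, 4.4, 4.0]   # hypothetical posterior draws
dic_value, p_d = dic(counts, rate_draws)
```

Lower DIC indicates a better trade-off between fit and complexity; in practice the draws would come from an MCMC fit of each competing spatial model.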

3.

Structural identification and variable selection in high-dimensional varying-coefficient models

Yuping Chen Wingkam Fung 《Journal of nonparametric statistics》2017,29(2):258-279

Varying-coefficient models have been widely used to investigate the possible time-dependent effects of covariates when the response variable comes from a normal distribution. Much progress has been made for inference and variable selection in the framework of such models. However, the identification of model structure, that is, how to identify which covariates have time-varying effects and which have fixed effects, remains a challenging and unsolved problem, especially when the dimension of covariates is much larger than the sample size. In this article, we consider the structural identification and variable selection problems in varying-coefficient models for high-dimensional data. Using a modified basis expansion approach and group variable selection methods, we propose a unified procedure to simultaneously identify the model structure, select important variables and estimate the coefficient curves. The unique feature of the proposed approach is that we do not have to specify the model structure in advance; therefore, it is more realistic and appropriate for real data analysis. Asymptotic properties of the proposed estimators have been derived under regularity conditions. Furthermore, we evaluate the finite sample performance of the proposed methods with Monte Carlo simulation studies and a real data analysis.

4.

Analysis of two-phase sampling data with semiparametric additive hazards models

Yanqing Sun Xiyuan Qian Qiong Shou Peter B. Gilbert 《Lifetime data analysis》2017,23(3):377-399

Under the case-cohort design introduced by Prentice (Biometrika 73:1–11, 1986), the covariate histories are ascertained only for the subjects who experience the event of interest (i.e., the cases) during the follow-up period and for a relatively small random sample from the original cohort (i.e., the subcohort). The case-cohort design has been widely used in clinical and epidemiological studies to assess the effects of covariates on failure times. Most statistical methods developed for the case-cohort design use the proportional hazards model, and few methods allow for time-varying regression coefficients. In addition, most methods disregard data from subjects outside of the subcohort, which can result in inefficient inference. Addressing these issues, this paper proposes an estimation procedure for the semiparametric additive hazards model with case-cohort/two-phase sampling data, allowing the covariates of interest to be missing for cases as well as for non-cases. A more flexible form of the additive model is considered that allows the effects of some covariates to be time varying while specifying the effects of others to be constant. An augmented inverse probability weighted estimation procedure is proposed. The proposed method allows utilizing the auxiliary information that correlates with the phase-two covariates to improve efficiency. The asymptotic properties of the proposed estimators are established. An extensive simulation study shows that the augmented inverse probability weighted estimation is more efficient than the widely adopted inverse probability weighted complete-case estimation method. The method is applied to analyze data from a preventive HIV vaccine efficacy trial.
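The weighting idea behind inverse probability weighted complete-case estimation can be sketched with a simple weighted mean: each subject with an observed phase-two covariate is weighted by the inverse of its selection probability. This is only the basic (Hajek-type) weighting principle, not the augmented estimator or the additive hazards estimating equations of the paper; the function name and the toy data are assumptions.

```python
def ipw_mean(values, selected, probs):
    """Inverse probability weighted mean of a phase-two covariate.

    values[i] is used only when selected[i] is True; probs[i] is the
    (assumed known) probability that subject i was selected into phase two.
    """
    weights = [1.0 / p if s else 0.0 for s, p in zip(selected, probs)]
    total = sum(w * v for w, v, s in zip(weights, values, selected) if s)
    return total / sum(weights)


# Hypothetical cohort: the first two subjects are cases (always sampled),
# the rest are non-cases sampled with probability 0.5.
values = [2.0, 4.0, None, 6.0, None, 8.0]
selected = [True, True, False, True, False, True]
probs = [1.0, 1.0, 0.5, 0.5, 0.5, 0.5]
estimate = ipw_mean(values, selected, probs)
```

Each sampled non-case stands in for two cohort members here, which is why it receives twice the weight of a case.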

5.

Non-penalty shrinkage estimation of random effect models for longitudinal data with AR(1) errors

Le An Lac 《Journal of Statistical Computation and Simulation》2018,88(16):3230-3247

In this paper, we consider the non-penalty shrinkage estimation method of random effect models with autoregressive errors for longitudinal data when there are many covariates and some of them may not be active for the response variable. In observational studies, subjects are followed over equally or unequally spaced visits to determine the continuous response and whether the response is associated with the risk factors/covariates. Measurements from the same subject are usually more similar to each other and thus are correlated with each other but not with observations of other subjects. To analyse these data, we consider a linear model that contains both random effects across subjects and within-subject errors that follow an autoregressive structure of order 1 (AR(1)). Considering the subject-specific random effect as a nuisance parameter, we use two competing models: one includes all the covariates and the other restricts the coefficients based on the auxiliary information. We consider the non-penalty shrinkage estimation strategy that shrinks the unrestricted estimator in the direction of the restricted estimator. We discuss the asymptotic properties of the shrinkage estimators using the notion of asymptotic biases and risks. A Monte Carlo simulation study is conducted to examine the relative performance of the shrinkage estimators with the unrestricted estimator when the shrinkage dimension exceeds two. We also numerically compare the performance of the shrinkage estimators to that of the LASSO estimator. A longitudinal CD4 cell count data set is used to illustrate the usefulness of shrinkage and LASSO estimators.
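Shrinking an unrestricted estimator toward a restricted one is often done with a positive-part Stein-type rule, which is one common form in this literature; the paper's exact definition may differ. In the sketch below, `test_stat` is assumed to be a positive statistic for testing the restriction and `k` the number of restricted coefficients; all names and numbers are hypothetical.

```python
def positive_shrinkage(beta_unres, beta_res, test_stat, k):
    """Positive-part Stein-type shrinkage of beta_unres toward beta_res."""
    if k <= 2:
        raise ValueError("the shrinkage dimension k must exceed two")
    if test_stat <= 0:
        raise ValueError("test_stat must be positive")
    # Truncating the factor at zero avoids over-shrinking past beta_res.
    factor = max(0.0, 1.0 - (k - 2) / test_stat)
    return [r + factor * (u - r) for u, r in zip(beta_unres, beta_res)]


beta_unres = [1.0, 0.5, 0.2, -0.1, 0.3]   # hypothetical full-model estimates
beta_res = [1.0, 0.5, 0.0, 0.0, 0.0]      # last three coefficients restricted to 0
strong_evidence = positive_shrinkage(beta_unres, beta_res, test_stat=4.0, k=3)
weak_evidence = positive_shrinkage(beta_unres, beta_res, test_stat=0.5, k=3)
```

A large test statistic (evidence against the restriction) keeps the estimate close to the unrestricted fit, while a small one collapses it onto the restricted estimator.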

6.

Bayesian inference for generalized additive mixed models based on Markov random field priors (total citations: 9; self-citations: 0; citations by others: 9)

Ludwig Fahrmeir & Stefan Lang 《Journal of the Royal Statistical Society. Series C, Applied statistics》2001,50(2):201-220

Most regression problems in practice require flexible semiparametric forms of the predictor for modelling the dependence of responses on covariates. Moreover, it is often necessary to add random effects accounting for overdispersion caused by unobserved heterogeneity or for correlation in longitudinal or spatial data. We present a unified approach for Bayesian inference via Markov chain Monte Carlo simulation in generalized additive and semiparametric mixed models. Different types of covariates, such as the usual covariates with fixed effects, metrical covariates with non-linear effects, unstructured random effects, trend and seasonal components in longitudinal data and spatial covariates, are all treated within the same general framework by assigning appropriate Markov random field priors with different forms and degrees of smoothness. We applied the approach in several case-studies and consulting cases, showing that the methods are also computationally feasible in problems with many covariates and large data sets. In this paper, we choose two typical applications.

7.

Semi-parametric survival analysis via Dirichlet process mixtures of the First Hitting Time model

Jonathan A. Race Michael L. Pennell 《Lifetime data analysis》2021,27(1):177-194

Time-to-event data often violate the proportional hazards assumption inherent in the popular Cox regression model. Such violations are especially common in the sphere of biological and medical data, where latent heterogeneity due to unmeasured covariates or time-varying effects is common. A variety of parametric survival models have been proposed in the literature which make more appropriate assumptions on the hazard function, at least for certain applications. One such model is derived from the First Hitting Time (FHT) paradigm, which assumes that a subject’s event time is determined by a latent stochastic process reaching a threshold value. Several random effects specifications of the FHT model have also been proposed which allow for better modeling of data with unmeasured covariates. While often appropriate, these methods often display limited flexibility due to their inability to model a wide range of heterogeneities. To address this issue, we propose a Bayesian model which loosens assumptions on the mixing distribution inherent in the random effects FHT models currently in use. We demonstrate via simulation study that the proposed model greatly improves both survival and parameter estimation in the presence of latent heterogeneity. We also apply the proposed methodology to data from a toxicology/carcinogenicity study which exhibits nonproportional hazards and contrast the results with both the Cox model and two popular FHT models.

8.

Estimating multiple-membership logit models with mixed effects: indirect inference versus data cloning

Anna Gottard Giorgio Calzolari 《Journal of Statistical Computation and Simulation》2017,87(12):2334-2348

Multiple-membership logit models with random effects are models for clustered binary data, where each statistical unit can belong to more than one group. The likelihood function of these models is analytically intractable. We propose two different approaches for parameter estimation: indirect inference and data cloning (DC). The former is a non-likelihood-based method which uses an auxiliary model to select reasonable estimates. We propose an auxiliary model with the same dimension of parameter space as the target model, which is particularly convenient to reach good estimates very fast. The latter method computes maximum likelihood estimates through the posterior distribution of an adequate Bayesian model, fitted to cloned data. We implement a DC algorithm specifically for multiple-membership models. A Monte Carlo experiment compares the two methods on simulated data. For further comparison, we also report Bayesian posterior mean and Integrated Nested Laplace Approximation hybrid DC estimates. Simulations show a negligible loss of efficiency for the indirect inference estimator, compensated by a relevant computational gain. The approaches are then illustrated with two real examples on matched paired data.

9.

Additive hazards regression of current status data with auxiliary covariates

Yanqin Feng Yuan Dong 《Communications in Statistics - Theory and Methods》2017,46(21):10657-10671

This paper discusses the regression analysis of current status failure time data arising from the additive hazards model with auxiliary covariates. As often occurs in practice, it is impossible or impractical to measure the exact magnitude of covariates for all subjects in a study. To compensate for the missing information, some auxiliary covariates are utilized instead. We propose two easy-to-implement procedures for estimation of regression parameters by making use of auxiliary information. The asymptotic properties of the resulting estimators are established and extensive numerical studies indicate that both procedures work well in practice.

10.

An index of local sensitivity to non-ignorability for parametric survival models with potential non-random missing covariate: an application to the SEER cancer registry data

S. Eftekhari Mahabadi 《Journal of applied statistics》2012,39(11):2327-2348

Several survival regression models have been developed to assess the effects of covariates on failure times. In various settings, including surveys, clinical trials and epidemiological studies, missing data may often occur due to incomplete covariate data. Most existing methods for lifetime data are based on the assumption of missing at random (MAR) covariates. However, in many substantive applications, it is important to assess the sensitivity of key model inferences to the MAR assumption. The index of sensitivity to non-ignorability (ISNI) is a local sensitivity tool to measure the potential sensitivity of key model parameters to small departures from the ignorability assumption, without the need to estimate a complicated non-ignorable model. We extend this sensitivity index to evaluate the impact of a covariate that is potentially missing not at random in survival analysis, using parametric survival models. The approach will be applied to investigate the impact of missing tumor grade on post-surgical mortality outcomes in individuals with pancreas-head cancer in the Surveillance, Epidemiology, and End Results data set. For patients suffering from cancer, tumor grade is an important risk factor. Many individuals in these data with pancreas-head cancer have missing tumor grade information. Our ISNI analysis shows that the magnitude of effect for most covariates (with significant effect on the survival time distribution), specifically surgery and tumor grade as some important risk factors in cancer studies, highly depends on the missing mechanism assumption of the tumor grade. A simulation study is also conducted to evaluate the performance of the proposed index in detecting sensitivity of key model parameters.

11.

Modelling Survival Events with Longitudinal Covariates Measured with Error

Hongsheng Dai Jianxin Pan Yanchun Bao 《Communications in Statistics - Theory and Methods》2013,42(21):3819-3837

In survival analysis, time-dependent covariates are usually present as longitudinal data collected periodically and measured with error. The longitudinal data can be assumed to follow a linear mixed effect model and Cox regression models may be used for modelling of survival events. The hazard rate of survival times depends on the underlying time-dependent covariate measured with error, which may be described by random effects. Most existing methods proposed for such models assume a parametric distribution for the random effects and specify a normally distributed error term for the linear mixed effect model. These assumptions may not always be valid in practice. In this article, we propose a new likelihood method for Cox regression models with error-contaminated time-dependent covariates. The proposed method does not require any parametric distribution assumption on random effects and random errors. Asymptotic properties for parameter estimators are provided. Simulation results show that under certain situations the proposed methods are more efficient than the existing methods.

12.

Bayesian Analysis of Nonlinear Reproductive Dispersion Mixed Models for Longitudinal Data with Nonignorable Missing Covariates

Nian-Sheng Tang Hui Zhao 《Communications in Statistics - Simulation and Computation》2013,42(6):1265-1287

This article proposes a Bayesian approach, which can simultaneously obtain the Bayesian estimates of unknown parameters and random effects, to analyze nonlinear reproductive dispersion mixed models (NRDMMs) for longitudinal data with nonignorable missing covariates and responses. The logistic regression model is employed to model the missing data mechanisms for missing covariates and responses. A hybrid sampling procedure combining the Gibbs sampler and the Metropolis-Hastings algorithm is presented to draw observations from the conditional distributions. Because the missing data mechanism is not testable, we develop the logarithm of the pseudo-marginal likelihood, deviance information criterion, the Bayes factor, and the pseudo-Bayes factor to compare several competing missing data mechanism models in the NRDMMs considered here with nonignorable missing covariates and responses. Three simulation studies and a real example taken from the paediatric AIDS clinical trial group ACTG are used to illustrate the proposed methodologies. Empirical results show that our proposed methods are effective in selecting missing data mechanism models.

13.

Gaussian models for degradation processes-part I: Methods for the analysis of biomarker data

Kjell A. Doksum Sharon-Lise T. Normand 《Lifetime data analysis》1995,1(2):131-144

We present two stochastic models that describe the relationship between biomarker process values at random time points, event times, and a vector of covariates. In both models the biomarker processes are degradation processes that represent the decay of systems over time. In the first model the biomarker process is a Wiener process whose drift is a function of the covariate vector. In the second model the biomarker process is taken to be the difference between a stationary Gaussian process and a time drift whose drift parameter is a function of the covariates. For both models we present statistical methods for estimation of the regression coefficients. The first model is useful for predicting the residual time from study entry to the time a critical boundary is reached while the second model is useful for predicting the latency time from the infection until the time the presence of the infection is detected. We present our methods principally in the context of conducting inference in a population of HIV infected individuals.
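For a Wiener process with positive drift toward a fixed boundary, the first hitting time follows an inverse Gaussian distribution with mean equal to the distance divided by the drift, which is what makes the first model convenient for predicting residual times. The sketch below evaluates that mean and density; the linear link between covariates and drift, the function names, and all numerical values are assumptions for illustration and are not taken from the paper.

```python
import math


def linear_drift(beta, z):
    """Hypothetical link: the drift is a linear function of the covariates."""
    return sum(b * zi for b, zi in zip(beta, z))


def mean_hitting_time(distance, drift):
    """Mean time for a Wiener process with positive drift to travel `distance`."""
    return distance / drift


def hitting_time_density(t, distance, drift, sigma):
    """Inverse Gaussian density of the first hitting time at time t > 0."""
    scale = distance / (sigma * math.sqrt(2.0 * math.pi * t ** 3))
    return scale * math.exp(-((distance - drift * t) ** 2) / (2.0 * sigma ** 2 * t))


beta = [0.5, 0.25]   # hypothetical regression coefficients
z = [1.0, 2.0]       # hypothetical covariate vector
drift = linear_drift(beta, z)
expected_time = mean_hitting_time(2.0, drift)   # distance of 2.0 to the boundary
```

The density can be integrated numerically to obtain the probability that the boundary is reached before a given time.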

14.

Semiparametric models of longitudinal and time-to-event data with applications to HIV viral dynamics and CD4 counts

Xiaobing Zhao Xian Zhou 《Journal of applied statistics》2015,42(11):2461-2477

We propose a semiparametric approach based on proportional hazards and copula method to jointly model longitudinal outcomes and the time-to-event. The dependence of the longitudinal outcomes on the covariates is modeled by a copula-based time series, which allows non-Gaussian random effects and overcomes the limitation of the parametric assumptions in existing linear and nonlinear random effects models. A modified partial likelihood method using estimated covariates at failure times is employed to draw statistical inference. The proposed model and method are applied to analyze a set of progression to AIDS data in a study of the association between the human immunodeficiency virus viral dynamics and the time trend in the CD4/CD8 ratio with measurement errors. Simulations are also reported to evaluate the proposed model and method.

15.

Computations via Auxiliary Random Functions for Survival Models

PURUSHOTTAM W. LAUD PAUL DAMIEN STEPHEN G. WALKER 《Scandinavian Journal of Statistics》2006,33(2):219-226

Abstract. A new simulation method, auxiliary random functions, is introduced. When used within a Gibbs sampler, this method enables a unified treatment of exact, right-censored, left-censored, left-truncated and interval censored data, with and without covariates in survival models. The models and methods are exemplified via illustrative analysis.

16.

Covariate Decomposition Methods for Longitudinal Missing‐at‐Random Data and Predictors Associated with Subject‐Specific Effects

John M. Neuhaus Charles E. McCulloch 《Australian & New Zealand Journal of Statistics》2014,56(4):331-345

Investigators often gather longitudinal data to assess changes in responses over time within subjects and to relate these changes to within‐subject changes in predictors. Missing data are common in such studies and predictors can be correlated with subject‐specific effects. Maximum likelihood methods for generalized linear mixed models provide consistent estimates when the data are ‘missing at random’ (MAR) but can produce inconsistent estimates in settings where the random effects are correlated with one of the predictors. On the other hand, conditional maximum likelihood methods (and closely related maximum likelihood methods that partition covariates into between‐ and within‐cluster components) provide consistent estimation when random effects are correlated with predictors but can produce inconsistent covariate effect estimates when data are MAR. Using theory, simulation studies, and fits to example data, this paper shows that decomposition methods using complete covariate information produce consistent estimates. In some practical cases these methods, which ostensibly require complete covariate information, actually only involve the observed covariates. These results offer an easy‐to‐use approach to simultaneously protect against bias from both cluster‐level confounding and MAR missingness in assessments of change.
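The partition of a covariate into between- and within-cluster components mentioned in this abstract can be sketched as follows: each value is split into its cluster mean and its deviation from that mean, and the two parts then enter the model as separate predictors. The function name and the toy data are assumptions; the handling of missing covariate values studied in the paper is not shown.

```python
def decompose_covariate(values_by_cluster):
    """Split covariate values into cluster means and within-cluster deviations."""
    parts = {}
    for cluster_id, values in values_by_cluster.items():
        mean = sum(values) / len(values)               # between-cluster component
        deviations = [v - mean for v in values]        # within-cluster component
        parts[cluster_id] = (mean, deviations)
    return parts


# Hypothetical repeated covariate measurements for two subjects.
covariate = {"a": [1.0, 2.0, 3.0], "b": [10.0, 14.0]}
parts = decompose_covariate(covariate)
```

The within-cluster deviations sum to zero in every cluster, so their coefficient is informed only by changes within subjects.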

17.

Inference in generalized additive mixed models by using smoothing splines

X. Lin & D. Zhang 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1999,61(2):381-400

Generalized additive mixed models are proposed for overdispersed and correlated data, which arise frequently in studies involving clustered, hierarchical and spatial designs. This class of models allows flexible functional dependence of an outcome variable on covariates by using nonparametric regression, while accounting for correlation between observations by using random effects. We estimate nonparametric functions by using smoothing splines and jointly estimate smoothing parameters and variance components by using marginal quasi-likelihood. Because numerical integration is often required by maximizing the objective functions, double penalized quasi-likelihood is proposed to make approximate inference. Frequentist and Bayesian inferences are compared. A key feature of the method proposed is that it allows us to make systematic inference on all model components within a unified parametric mixed model framework and can be easily implemented by fitting a working generalized linear mixed model by using existing statistical software. A bias correction procedure is also proposed to improve the performance of double penalized quasi-likelihood for sparse data. We illustrate the method with an application to infectious disease data and we evaluate its performance through simulation.

18.

A semiparametric additive rates model for recurrent event data

Schaubel DE Zeng D Cai J 《Lifetime data analysis》2006,12(4):389-406

Recurrent event data often arise in biomedical studies, with examples including hospitalizations, infections, and treatment failures. In observational studies, it is often of interest to estimate the effects of covariates on the marginal recurrent event rate. The majority of existing rate regression methods assume multiplicative covariate effects. We propose a semiparametric model for the marginal recurrent event rate, wherein the covariates are assumed to add to the unspecified baseline rate. Covariate effects are summarized by rate differences, meaning that the absolute effect on the rate function can be determined from the regression coefficient alone. We describe modifications of the proposed method to accommodate a terminating event (e.g., death). Proposed estimators of the regression parameters and baseline rate are shown to be consistent and asymptotically Gaussian. Simulation studies demonstrate that the asymptotic approximations are accurate in finite samples. The proposed methods are applied to a state-wide kidney transplant data set.
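The contrast between additive and multiplicative covariate effects can be written out explicitly. The notation below is assumed for illustration, since the abstract gives no symbols: N*(t) counts recurrent events, Z(t) is the covariate vector, and μ₀ is the unspecified baseline mean function.

```latex
% Additive rates model (assumed notation): covariates shift the baseline rate,
% so each coefficient is a rate difference.
E\{dN^{*}(t) \mid Z(t)\} = d\mu_{0}(t) + \beta_{0}^{\top} Z(t)\, dt

% Multiplicative rates model, shown for contrast: covariates scale the baseline rate,
% so each coefficient is a log rate ratio.
E\{dN^{*}(t) \mid Z(t)\} = \exp\{\beta_{0}^{\top} Z(t)\}\, d\mu_{0}(t)
```

Under the additive form, a one-unit change in a covariate changes the event rate by the corresponding coefficient regardless of the baseline level, which matches the rate-difference interpretation in the abstract.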

19.

A flexible approach for multivariate mixed-effects models with non-ignorable missing values

《Journal of Statistical Computation and Simulation》2012,82(18):3727-3743

We propose a flexible model approach for the distribution of random effects when both response variables and covariates have non-ignorable missing values in a longitudinal study. A Bayesian approach is developed with a choice of nonparametric prior for the distribution of random effects. We apply the proposed method to a real data example from a national long-term survey by Statistics Canada. We also design simulation studies to further check the performance of the proposed approach. The results of the simulation studies indicate that the proposed approach outperforms the conventional approach with normality assumption when the heterogeneity in random effects distribution is salient.

20.

Bayesian Methods for Missing Covariates in Cure Rate Models

Chen MH Ibrahim JG Lipsitz SR 《Lifetime data analysis》2002,8(2):117-146

We propose methods for Bayesian inference for missing covariate data with a novel class of semi-parametric survival models with a cure fraction. We allow the missing covariates to be either categorical or continuous and specify a parametric distribution for the covariates that is written as a sequence of one dimensional conditional distributions. We assume that the missing covariates are missing at random (MAR) throughout. We propose an informative class of joint prior distributions for the regression coefficients and the parameters arising from the covariate distributions. The proposed class of priors is shown to be useful in recovering information on the missing covariates, especially in situations where the missing data fraction is large. Properties of the proposed prior and resulting posterior distributions are examined. Also, model checking techniques are proposed for sensitivity analyses and for checking the goodness of fit of a particular model. Specifically, we extend the Conditional Predictive Ordinate (CPO) statistic to assess goodness of fit in the presence of missing covariate data. Computational techniques using the Gibbs sampler are implemented. A real data set involving a melanoma cancer clinical trial is examined to demonstrate the methodology.