Similar literature
20 similar documents found.
1.
Summary.  In longitudinal studies, missing data are often unavoidable. Estimators from the linear mixed effects model assume that missing data are missing at random; however, they are biased when this assumption is not met. In this paper, theoretical results for the asymptotic bias are established under non-ignorable drop-out, drop-in and other missing data patterns. The asymptotic bias is large when the drop-out subjects have only one or no observation, especially for the slope-related parameters of the linear mixed effects model. In the drop-in case, intercept-related parameter estimators show substantial asymptotic bias when subjects enter the study late. Eight other missing data patterns are considered, and these produce asymptotic biases of a variety of magnitudes.

2.
Pharmacokinetic (PK) data often contain concentration measurements below the quantification limit (BQL). While specific values cannot be assigned to these observations, the BQL data are nevertheless informative: they are known to lie below the lower limit of quantification (LLQ). Treating BQL observations as missing data violates the usual missing at random (MAR) assumption underlying standard statistical methods and therefore leads to biased or less precise parameter estimates. By definition, these data lie within the interval [0, LLQ] and can be treated as censored observations. Statistical methods that handle censored data, such as maximum likelihood and Bayesian methods, are thus useful in modelling such data sets. The main aim of this work was to investigate the impact of the amount of BQL observations on the bias and precision of parameter estimates in population PK models (non-linear mixed effects models in general) under the maximum likelihood method as implemented in SAS and NONMEM, and under a Bayesian approach using Markov chain Monte Carlo (MCMC) as implemented in WinBUGS. A second aim was to compare these methods for handling BQL or censored data in a practical situation. The evaluation was illustrated by simulation based on a simple PK model: a number of data sets were simulated from a one-compartment, first-order elimination PK model, and several quantification limits were applied to each simulated data set to generate data sets with given amounts of BQL data. The average percentage of BQL observations ranged from 25% to 75%. Their influence on the bias and precision of all population PK model parameters, such as clearance and volume of distribution, was explored and compared under each estimation approach.
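
To make the censoring idea concrete, here is a minimal illustrative sketch (not the paper's SAS, NONMEM or WinBUGS implementations; the log-normal residual model, the LLQ value and all names are assumptions): BQL observations contribute the probability of falling below the LLQ to the likelihood rather than a density value.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def neg_log_likelihood(params, log_conc, is_bql, log_llq):
    """Gaussian model on the log scale; BQL values contribute P(Y < log LLQ)."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)                                   # keep sd positive
    ll_obs = norm.logpdf(log_conc[~is_bql], mu, sigma).sum()    # quantified samples
    ll_cens = norm.logcdf(log_llq, mu, sigma) * is_bql.sum()    # left-censored samples
    return -(ll_obs + ll_cens)

# Hypothetical data: some simulated concentrations fall below LLQ = 0.5
rng = np.random.default_rng(1)
conc = rng.lognormal(mean=0.0, sigma=0.8, size=200)
llq = 0.5
is_bql = conc < llq
fit = minimize(neg_log_likelihood, x0=[0.0, 0.0],
               args=(np.log(np.where(is_bql, llq, conc)), is_bql, np.log(llq)),
               method="Nelder-Mead")
print(fit.x)  # estimated mean and log-sd on the log-concentration scale
```

In a population PK setting the same left-censored term would be embedded inside a non-linear mixed effects likelihood; the sketch only shows the censored contribution itself.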

3.
This study compares two methods for handling missing data in longitudinal trials: one using the last-observation-carried-forward (LOCF) method and one based on a multivariate or mixed model for repeated measurements (MMRM). Using data sets simulated to match six actual trials, I imposed several drop-out mechanisms, and compared the methods in terms of bias in the treatment difference and power of the treatment comparison. With equal drop-out in Active and Placebo arms, LOCF generally underestimated the treatment effect; but with unequal drop-out, bias could be much larger and in either direction. In contrast, bias with the MMRM method was much smaller; and whereas MMRM rarely caused a difference in power of greater than 20%, LOCF caused a difference in power of greater than 20% in nearly half the simulations. Use of the LOCF method is therefore likely to misrepresent the results of a trial seriously, and so is not a good choice for primary analysis. In contrast, the MMRM method is unlikely to result in serious misinterpretation, unless the drop-out mechanism is missing not at random (MNAR) and there is substantially unequal drop-out. Moreover, MMRM is clearly more reliable and better grounded statistically. Neither method is capable of dealing on its own with trials involving MNAR drop-out mechanisms, for which sensitivity analysis is needed using more complex methods.
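
For readers unfamiliar with LOCF, the imputation step itself is a simple carry-forward, shown below on a wide-format data frame (hypothetical visit names, not the author's simulation code; the MMRM comparator requires a full mixed-model fit and is not reproduced here):

```python
import pandas as pd
import numpy as np

# Hypothetical trial data: one row per subject, one column per scheduled visit
df = pd.DataFrame({
    "week0": [10.0, 12.0, 9.0],
    "week4": [11.0, np.nan, 8.0],
    "week8": [np.nan, np.nan, 7.0],
})

# LOCF: fill each missing visit with the last observed value for that subject
locf = df.ffill(axis=1)
print(locf)
```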

4.
Summary.  The main statistical problem in many epidemiological studies which involve repeated measurements of surrogate markers is the frequent occurrence of missing data. Standard likelihood-based approaches like the linear random-effects model fail to give unbiased estimates when data are non-ignorably missing. In human immunodeficiency virus (HIV) type 1 infection, two markers that have been widely used to track progression of the disease are CD4 cell counts and HIV–ribonucleic acid (RNA) viral load levels. Repeated measurements of these markers tend to be informatively censored, which is a special case of non-ignorable missingness. In such cases, we need to apply methods that jointly model the observed data and the missingness process. Despite their high correlation, longitudinal data on these markers have mostly been analysed independently by using random-effects models. Touloumi and co-workers have proposed the joint multivariate random-effects model, which combines a linear random-effects model for the underlying pattern of the marker with a log-normal survival model for the drop-out process. We extend the joint multivariate random-effects model to model the CD4 cell and viral load data simultaneously while adjusting for informative drop-outs due to disease progression or death. Estimates of all the model's parameters are obtained by the restricted iterative generalized least squares method or, in the case of censored survival data, by a modified version that uses the EM algorithm as a nested algorithm and also accounts for non-linearity in the HIV–RNA trend. The method proposed is evaluated and compared with simpler approaches in a simulation study. Finally the method is applied to a subset of the data from the 'Concerted action on seroconversion to AIDS and death in Europe' study.

5.
Missing data methods for longitudinal questionnaire data, maximum likelihood estimation (MLE) and multiple imputation (MI), were investigated via simulation. Predictive mean matching (PMM) was applied at both the item and scale levels, logistic regression at the item level, and multivariate normal imputation at the scale level. We also investigated a hybrid approach that combines MLE and MI: scales from the imputed data are eliminated if all underlying items were originally missing. Bias and mean square error (MSE) of the parameter estimates were examined. ML occasionally provided the best results in terms of bias, but hardly ever in terms of MSE. The imputation methods at the scale level and logistic regression at the item level hardly ever showed the best performance. The hybrid approach performed similarly to, or better than, its original MI counterpart. The PMM-hybrid approach at the item level showed the best MSE in most settings and, in some cases, also the smallest bias.
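
As an illustration of the PMM idea mentioned above, here is a simplified single-imputation sketch with assumed names (proper MI would additionally draw the regression parameters from their posterior and repeat the procedure across several imputed data sets):

```python
import numpy as np

def pmm_impute(y, X, k=5, rng=None):
    """Predictive mean matching sketch: impute each missing y by drawing an
    observed value from the k donors whose OLS predictions are closest."""
    if rng is None:
        rng = np.random.default_rng()
    obs = ~np.isnan(y)
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)  # OLS on observed rows
    pred = X @ beta                                          # predictions for all rows
    y_imp = y.copy()
    for i in np.flatnonzero(~obs):
        # donors: observed cases whose predicted value is closest to case i's prediction
        donors = np.argsort(np.abs(pred[obs] - pred[i]))[:k]
        y_imp[i] = rng.choice(y[obs][donors])                # impute an observed value
    return y_imp
```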

6.
This work provides a set of macros written in SAS (Statistical Analysis System) for Windows, which can be used to fit conditional models under intermittent missingness in longitudinal data. A formalized transition model, including random effects for individuals and measurement error, is presented. Model fitting is based on the missing completely at random or missing at random assumptions and the separability condition. The problem then reduces to maximizing the marginal observed-data density only, which for Gaussian data is again Gaussian, so that the likelihood can be expressed in terms of the mean and covariance matrix of the observed data vector. A simulation study is presented and misspecification issues are considered. A practical application is also given, in which conditional models are fitted to data from a clinical trial that assessed the effect of a Cuban medicine on a disease of the respiratory system.

7.
In clinical practice, the profile of each subject's CD4 response from a longitudinal study may follow a ‘broken stick’-like trajectory, indicating multiple phases of increase and/or decline in response. Such multiple phases (changepoints) may be important indicators to help quantify treatment effect and improve the management of patient care. Although it is common practice to analyze complex AIDS longitudinal data using nonlinear mixed-effects (NLME) or nonparametric mixed-effects (NPME) models, estimating changepoints within NLME or NPME models is challenging because of the complicated structure of the model formulations. In this paper, we propose a changepoint mixed-effects model with random subject-specific parameters, including the changepoint, for the analysis of longitudinal CD4 cell counts for HIV-infected subjects following highly active antiretroviral treatment. The longitudinal CD4 data in this study may exhibit departures from symmetry, may contain missing observations for various reasons that are likely to be non-ignorable in the sense that missingness may be related to the missing values, and may be censored at the time the subject goes off study treatment, which is a potentially informative dropout mechanism. Inferential procedures can become dramatically more complicated when longitudinal CD4 data with asymmetry (skewness), incompleteness and informative dropout are observed in conjunction with an unknown changepoint. Our objective is to address the simultaneous impact of skewness, missingness and informative censoring by jointly modeling the CD4 response and dropout time processes under a Bayesian framework. The method is illustrated using a real AIDS data set to compare candidate models under various scenarios, and some results of interest are presented.
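
As a point of reference for the ‘broken stick’ terminology, a generic piecewise-linear subject trajectory with a single random changepoint can be written as (this is only the common textbook form, not the paper's full Bayesian model with skewness, censoring and informative dropout):

$$ \mathrm{E}[y_{ij} \mid b_i] \;=\; \beta_{0i} + \beta_{1i}\, t_{ij} + \beta_{2i}\,(t_{ij} - \tau_i)_{+}, \qquad (u)_{+} = \max(u, 0), $$

where the subject-specific intercept, slopes and changepoint $(\beta_{0i}, \beta_{1i}, \beta_{2i}, \tau_i)$ are treated as random effects and $\beta_{2i}$ measures the change in slope after the changepoint $\tau_i$.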

8.
This study investigated the bias of factor loadings obtained from incomplete questionnaire data with imputed scores. Three models were used to generate discrete ordered rating scale data typical of questionnaires, also known as Likert data: the multidimensional polytomous latent trait model, a normal ogive item response theory model, and the discretized normal model. Incomplete data due to nonresponse were simulated under either missing completely at random or not missing at random mechanisms. Subsequently, four imputation methods were applied to each incomplete data matrix to impute item scores. Based on a completely crossed six-factor design, it was concluded that, in general, bias was small for all data simulation methods, all imputation methods, and all nonresponse mechanisms. The two-way-plus-error imputation method had the smallest bias in the factor loadings. Bias based on the discretized normal model was greater than that based on the other two models.
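
Two-way-plus-error imputation is commonly described as imputing person mean + item mean − overall mean plus a random residual; the sketch below follows that description (the exact variant used in this study is an assumption, and for Likert data the imputed value would normally be rounded back to an admissible category):

```python
import numpy as np

def two_way_plus_error(X, rng=None):
    """Sketch of two-way-plus-error imputation for an n-by-m item-score matrix X
    with NaN marking missing entries."""
    if rng is None:
        rng = np.random.default_rng()
    pm = np.nanmean(X, axis=1, keepdims=True)    # person means
    im = np.nanmean(X, axis=0, keepdims=True)    # item means
    om = np.nanmean(X)                           # overall mean
    fitted = pm + im - om                        # two-way additive fit
    resid_sd = np.nanstd(X - fitted)             # residual spread on observed cells
    X_imp = X.copy()
    miss = np.isnan(X)
    X_imp[miss] = (fitted + rng.normal(0.0, resid_sd, X.shape))[miss]
    return X_imp
```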

9.
Objectives in many longitudinal studies of individuals infected with the human immunodeficiency virus (HIV) include the estimation of population average trajectories of HIV ribonucleic acid (RNA) over time and tests for differences in trajectory across subgroups. Special features that are often inherent in the underlying data include a tendency for some HIV RNA levels to be below an assay detection limit, and for individuals with high initial levels or high rates of change to drop out of the study early because of illness or death. We develop a likelihood for the observed data that incorporates both of these features. Informative drop-outs are handled by means of an approach previously published by Schluchter. Using data from the HIV Epidemiology Research Study, we implement a maximum likelihood procedure to estimate initial HIV RNA levels and slopes within a population, compare these parameters across subgroups of HIV-infected women and illustrate the importance of appropriate treatment of left censoring and informative drop-outs. We also assess model assumptions and consider the prediction of random intercepts and slopes in this setting. The results suggest that marked bias in estimates of fixed effects, variance components and standard errors in the analysis of HIV RNA data might be avoided by the use of methods like those illustrated.

10.
A full likelihood method is proposed to analyse continuous longitudinal data with non-ignorable (informative) missing values and non-monotone patterns. The problem arose in a breast cancer clinical trial where repeated assessments of quality of life were collected: patients rated their coping ability during and after treatment. We allow the missingness probabilities to depend on unobserved responses, and we use a multivariate normal model for the outcomes. A first-order Markov dependence structure for the responses is a natural choice and facilitates the construction of the likelihood; estimates are obtained via the Nelder–Mead simplex algorithm. Computations are difficult and become intractable with more than three or four assessments. Applying the method to the quality-of-life data results in easily interpretable estimates, confirms the suspicion that the data are non-ignorably missing and highlights the likely bias of standard methods. Although treatment comparisons are not affected here, the methods are useful for obtaining unbiased means and estimating trends over time.

11.
Summary.  Recurrent events models have received considerable attention recently. The majority of approaches show the consistency of parameter estimates under the assumption that censoring is independent of the recurrent events process of interest conditional on the covariates that are included in the model. We provide an overview of available recurrent events analysis methods and present an inverse probability of censoring weighted estimator for the regression parameters in the Andersen–Gill model that is commonly used for recurrent event analysis. This estimator remains consistent under informative censoring if the censoring mechanism is estimated consistently, and it generally improves on the naïve estimator for the Andersen–Gill model in the case of independent censoring. We illustrate the bias of ad hoc estimators in the presence of informative censoring with a simulation study and provide a data analysis of recurrent lung exacerbations in cystic fibrosis patients when some patients are lost to follow-up.
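
For orientation, inverse probability of censoring weighting in this kind of setting typically weights subject $i$'s at-risk contribution at time $t$ by the inverse of the estimated probability of remaining uncensored, roughly

$$ w_i(t) = \frac{I(C_i \ge t)}{\hat{K}_i(t)}, \qquad \hat{K}_i(t) = \widehat{\Pr}(C_i \ge t \mid \text{covariates}), $$

so that subjects resembling those lost to follow-up are up-weighted. The paper's estimator embeds such weights in the Andersen–Gill estimating equations; the exact form used there is not reproduced here.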

12.
A large number of models have been derived from the two-parameter Weibull distribution, including the inverse Weibull (IW) model, which has been found suitable for modeling complex failure data sets. In this paper, we present Bayesian inference for a mixture of two IW models. For this purpose, the Bayes estimates of the parameters of the mixture model, along with their posterior risks, are obtained under informative as well as non-informative priors. These estimates are obtained for two cases: (a) when the shape parameter is known and (b) when all parameters are unknown. For the former case, Bayes estimates are obtained under three loss functions, while for the latter case only the squared error loss function is used. A simulation study is carried out in order to explore numerical aspects of the proposed Bayes estimators. A real-life data set is also analysed for both cases, and the parameters obtained in the known-shape-parameter case are assessed through a hypothesis testing procedure.
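
Using a common parameterization of the inverse Weibull distribution (an assumption about notation, not necessarily the one in the article), the two-component mixture density is

$$ f(x) = p\, f_{1}(x) + (1-p)\, f_{2}(x), \qquad f_{m}(x) = \alpha_m \beta_m^{\alpha_m}\, x^{-(\alpha_m + 1)} \exp\!\left\{ -\left(\beta_m / x\right)^{\alpha_m} \right\}, \quad x > 0, $$

with mixing weight $p \in (0,1)$, shape parameters $\alpha_m$ and scale parameters $\beta_m$; case (a) above corresponds to the shape parameter being known.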

13.
We consider inference in randomized longitudinal studies with missing data that are generated by skipped clinic visits and loss to follow-up. In this setting, it is well known that full data estimands are not identified unless unverified assumptions are imposed. We assume a non-future dependence model for the drop-out mechanism and partial ignorability for the intermittent missingness. We posit an exponential tilt model that links non-identifiable distributions and distributions identified under partial ignorability. This exponential tilt model is indexed by non-identified parameters, which are assumed to have an informative prior distribution elicited from subject-matter experts. Under this model, full data estimands are shown to be expressed as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated by, and applied to, data from the Breast Cancer Prevention Trial.

14.
We propose a latent Markov quantile regression model for longitudinal data with non-informative drop-out. The observations, conditionally on covariates, are modeled through an asymmetric Laplace distribution. Random effects are assumed to be time-varying and to follow a first order latent Markov chain. This latter assumption is easily interpretable and allows exact inference through an ad hoc EM-type algorithm based on appropriate recursions. Finally, we illustrate the model on a benchmark data set.
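
For context, the asymmetric Laplace density underlying quantile regression at level $\tau$ is typically written as

$$ f(y \mid \mu, \sigma, \tau) = \frac{\tau(1-\tau)}{\sigma} \exp\!\left\{ -\rho_\tau\!\left(\frac{y-\mu}{\sigma}\right) \right\}, \qquad \rho_\tau(u) = u\,\bigl(\tau - I(u < 0)\bigr), $$

so that maximizing this likelihood in $\mu$ is equivalent to minimizing the usual quantile check loss; in the model above, the location $\mu$ depends on covariates and on the time-varying latent Markov state.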

15.
Communications in Statistics – Theory and Methods, 2012, 41(16–17): 3278–3300
Under complex survey sampling, in particular when selection probabilities depend on the response variable (informative sampling), the sample and population distributions differ, possibly resulting in selection bias. This article addresses this problem by fitting two statistical models for one-way analysis of variance under a complex survey design (e.g. two-stage sampling, stratification and unequal probabilities of selection): the variance components model (a two-stage model) and the fixed effects model (a single-stage model). Classical theory underlying the use of the two-stage model involves simple random sampling at each of the two stages; in such cases the model holding in the sample, after sample selection, is the same as the model holding in the population before sample selection. When the selection probabilities are related to the values of the response variable, standard estimates of the population model parameters may be severely biased, possibly leading to false inference. The idea behind the approach is to extract the model holding for the sample data as a function of the population model and of the first-order inclusion probabilities, and then to fit the sample model using analysis of variance, maximum likelihood and pseudo maximum likelihood methods of estimation. The main feature of the proposed techniques is their behavior in terms of the informativeness parameter. We also show that use of the population model that ignores the informative sampling design yields biased model fitting.

16.
Non-ignorable missing data are a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or an intermittent missing data pattern. In our study, we propose a new two-latent-class model for categorical data with informative dropouts, dividing the observed data into two latent classes: one class in which the outcomes are deterministic and a second in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by maximum likelihood based on the above assumptions and on the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves, both in simulations and in an application to a smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missingness processes. The application results show that our proposed method outperforms the shared parameter model and the weighted GEE model.

17.
This article considers a discrete-time Markov chain for modeling transition probabilities when multiple successive observations are missing at random between two observed outcomes, using three methods: a naïve analog of complete-case analysis that uses the observed one-step transitions alone, a non-data-augmentation method (NL) based on solving nonlinear equations, and a data-augmentation method, the Expectation-Maximization (EM) algorithm. The explicit form of the conditional log-likelihood given the observed information, as required by the E step, is provided, and the iterative formula in the M step is expressed in closed form. An empirical study was performed to examine the accuracy and precision of the estimates obtained by the three methods under the ignorable missing mechanisms of missing completely at random and missing at random. A data set from the mental health arena was used for illustration. It was found that both the data-augmentation and non-augmentation methods provide accurate and precise point estimation, and that the naïve method resulted in estimates of the transition probabilities with similar bias but larger MSE. The NL method and the EM algorithm in general provide similar results, whereas the latter provides conditional expected row margins leading to smaller standard errors.
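
A minimal sketch of the quantity such methods work with (illustrative only, not the article's NL equations or EM recursions; the matrix and variable names are assumptions): when the observations between two recorded states are missing, the probability of the observed pair is an entry of a matrix power of the one-step transition matrix.

```python
import numpy as np

def gap_log_likelihood(P, i, j, k):
    """Log-probability of observing state j exactly k steps after state i
    when the k-1 intermediate states are missing: entry (i, j) of P**k."""
    Pk = np.linalg.matrix_power(P, k)
    return np.log(Pk[i, j])

# Hypothetical 2-state chain; a subject observed in state 0, then (after two
# skipped visits) in state 1, i.e. a three-step gap.
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
print(gap_log_likelihood(P, 0, 1, 3))
```

The naïve analog treats such a pair as if it were a one-step transition, which is what leads to the larger MSE reported above.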

18.
Estimation of mixtures of regression models is usually based on the assumption of normally distributed components, and maximum likelihood estimation of the normal components is sensitive to noise, outliers and high-leverage points. Missing values are inevitable in many situations, and parameter estimates can be biased if the missing values are not handled properly. In this article, we propose mixtures of regression models for contaminated, incomplete, heterogeneous data. The proposed models provide robust estimates of the regression coefficients varying across latent subgroups, even in the presence of missing values. The methodology is illustrated through simulation studies and a real data analysis.

19.
This paper deals with small area indirect estimators under area-level random effect models when only area-level data are available and the random effects are correlated. The performance of the Spatial Empirical Best Linear Unbiased Predictor (SEBLUP) is explored with a Monte Carlo simulation study on lattice data, and it is applied to the results of the sample survey on Life Conditions in Tuscany (Italy). The mean squared error (MSE) problem is discussed by comparing the MSE estimator with the MSE of the empirical sampling distribution of the SEBLUP estimator. A clear tendency in our empirical findings is that the introduction of spatially correlated random area effects reduces both the variance and the bias of the EBLUP estimator. Despite some residual bias, the coverage rate of our confidence intervals comes close to the nominal 95%.

20.
Two important models in survival analysis are the general random censorship model and the proportional hazards submodel of Koziol and Green. The difference between the two models is the way in which the lifetime variable is censored (informative versus non-informative censoring). In this paper the two viewpoints are combined into a new model which allows the lifetimes to be censored by two types of variables, one of which censors in an informative way and the other in a non-informative way. The lifetimes and the censoring times are also allowed to depend on covariates in a very general way. The estimator for the conditional distribution of the lifetimes generalizes that of Gather and Pawlitschko (1998, Metrika 48, 189–209), who recently studied the situation without covariate information. The results obtained include uniform strong consistency (with a rate), an almost sure asymptotic representation, and weak convergence of the process.

