期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Using multiple imputation to estimate cumulative distribution functions in longitudinal data analysis with data missing at random

Phillip Dinh 《Pharmaceutical statistics》2013,12(5):260-267

In longitudinal clinical studies, after randomization at baseline, subjects are followed for a period of time for development of symptoms. The interested inference could be the mean change from baseline to a particular visit in some lab values, the proportion of responders to some threshold category at a particular visit post baseline, or the time to some important event. However, in some applications, the interest may be in estimating the cumulative distribution function (CDF) at a fixed time point post baseline. When the data are fully observed, the CDF can be estimated by the empirical CDF. When patients discontinue prematurely during the course of the study, the empirical CDF cannot be directly used. In this paper, we use multiple imputation as a way to estimate the CDF in longitudinal studies when data are missing at random. The validity of the method is assessed on the basis of the bias and the Kolmogorov–Smirnov distance. The results suggest that multiple imputation yields less bias and less variability than the often used last observation carried forward method. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

2.

Choice of the primary analysis in longitudinal clinical trials

Craig H. Mallinckrodt John G. Watkin Geert Molenberghs Raymond J. Carroll 《Pharmaceutical statistics》2004,3(3):161-169

Missing data, and the bias they can cause, are an almost ever‐present concern in clinical trials. The last observation carried forward (LOCF) approach has been frequently utilized to handle missing data in clinical trials, and is often specified in conjunction with analysis of variance (LOCF ANOVA) for the primary analysis. Considerable advances in statistical methodology, and in our ability to implement these methods, have been made in recent years. Likelihood‐based, mixed‐effects model approaches implemented under the missing at random (MAR) framework are now easy to implement, and are commonly used to analyse clinical trial data. Furthermore, such approaches are more robust to the biases from missing data, and provide better control of Type I and Type II errors than LOCF ANOVA. Empirical research and analytic proof have demonstrated that the behaviour of LOCF is uncertain, and in many situations it has not been conservative. Using LOCF as a composite measure of safety, tolerability and efficacy can lead to erroneous conclusions regarding the effectiveness of a drug. This approach also violates the fundamental basis of statistics as it involves testing an outcome that is not a physical parameter of the population, but rather a quantity that can be influenced by investigator behaviour, trial design, etc. Practice should shift away from using LOCF ANOVA as the primary analysis and focus on likelihood‐based, mixed‐effects model approaches developed under the MAR framework, with missing not at random methods used to assess robustness of the primary analysis. Copyright © 2004 John Wiley & Sons, Ltd. 相似文献

3.

Missing data in clinical trials: from clinical assumptions to statistical analysis using pattern mixture models

Bohdana Ratitch Michael O'Kelly Robert Tosiello 《Pharmaceutical statistics》2013,12(6):337-347

The need to use rigorous, transparent, clearly interpretable, and scientifically justified methodology for preventing and dealing with missing data in clinical trials has been a focus of much attention from regulators, practitioners, and academicians over the past years. New guidelines and recommendations emphasize the importance of minimizing the amount of missing data and carefully selecting primary analysis methods on the basis of assumptions regarding the missingness mechanism suitable for the study at hand, as well as the need to stress‐test the results of the primary analysis under different sets of assumptions through a range of sensitivity analyses. Some methods that could be effectively used for dealing with missing data have not yet gained widespread usage, partly because of their underlying complexity and partly because of lack of relatively easy approaches to their implementation. In this paper, we explore several strategies for missing data on the basis of pattern mixture models that embody clear and realistic clinical assumptions. Pattern mixture models provide a statistically reasonable yet transparent framework for translating clinical assumptions into statistical analyses. Implementation details for some specific strategies are provided in an Appendix (available online as Supporting Information), whereas the general principles of the approach discussed in this paper can be used to implement various other analyses with different sets of assumptions regarding missing data. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

4.

Missing data in principal component analysis of questionnaire data: a comparison of methods

《Journal of Statistical Computation and Simulation》2012,82(11):2298-2315

Principal component analysis (PCA) is a widely used statistical technique for determining subscales in questionnaire data. As in any other statistical technique, missing data may both complicate its execution and interpretation. In this study, six methods for dealing with missing data in the context of PCA are reviewed and compared: listwise deletion (LD), pairwise deletion, the missing data passive approach, regularized PCA, the expectation-maximization algorithm, and multiple imputation. Simulations show that except for LD, all methods give about equally good results for realistic percentages of missing data. Therefore, the choice of a procedure can be based on the ease of application or purely the convenience of availability of a technique. 相似文献

5.

A Bayesian hierarchical model for categorical longitudinal data from a social survey of immigrants

A. N. Pettitt T. T. Tran M. A. Haynes J. L. Hay 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2006,169(1):97-114

Summary. The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia. 相似文献

6.

Modeling longitudinal count data with dropouts

Mohamed Alosh 《Pharmaceutical statistics》2010,9(1):35-45

This paper explores the utility of different approaches for modeling longitudinal count data with dropouts arising from a clinical study for the treatment of actinic keratosis lesions on the face and balding scalp. A feature of these data is that as the disease for subjects on the active arm improves their data show larger dispersion compared with those on the vehicle, exhibiting an over‐dispersion relative to the Poisson distribution. After fitting the marginal (or population averaged) model using the generalized estimating equation (GEE), we note that inferences from such a model might be biased as dropouts are treatment related. Then, we consider using a weighted GEE (WGEE) where each subject's contribution to the analysis is weighted inversely by the subject's probability of dropout. Based on the model findings, we argue that the WGEE might not address the concerns about the impact of dropouts on the efficacy findings when dropouts are treatment related. As an alternative, we consider likelihood‐based inference where random effects are added to the model to allow for heterogeneity across subjects. Finally, we consider a transition model where, unlike the previous approaches that model the log‐link function of the mean response, we model the subject's actual lesion counts. This model is an extension of the Poisson autoregressive model of order 1, where the autoregressive parameter is taken to be a function of treatment as well as other covariates to induce different dispersions and correlations for the two treatment arms. We conclude with a discussion about model selection. Published in 2009 by John Wiley & Sons, Ltd. 相似文献

7.

Handling of missing data in long‐term clinical trials: a case study

Mark Janssens Geert Molenberghs René Kerstens 《Pharmaceutical statistics》2012,11(6):442-448

Missing data in clinical trials is a well‐known problem, and the classical statistical methods used can be overly simple. This case study shows how well‐established missing data theory can be applied to efficacy data collected in a long‐term open‐label trial with a discontinuation rate of almost 50%. Satisfaction with treatment in chronically constipated patients was the efficacy measure assessed at baseline and every 3 months postbaseline. The improvement in treatment satisfaction from baseline was originally analyzed with a paired t‐test ignoring missing data and discarding the correlation structure of the longitudinal data. As the original analysis started from missing completely at random assumptions regarding the missing data process, the satisfaction data were re‐examined, and several missing at random (MAR) and missing not at random (MNAR) techniques resulted in adjusted estimate for the improvement in satisfaction over 12 months. Throughout the different sensitivity analyses, the effect sizes remained significant and clinically relevant. Thus, even for an open‐label trial design, sensitivity analysis, with different assumptions for the nature of dropouts (MAR or MNAR) and with different classes of models (selection, pattern‐mixture, or multiple imputation models), has been found useful and provides evidence towards the robustness of the original analyses; additional sensitivity analyses could be undertaken to further qualify robustness. Copyright © 2012 John Wiley & Sons, Ltd. 相似文献

8.

Review of guidelines and literature for handling missing data in longitudinal clinical trials with a case study 总被引：1，自引：0，他引：1

Liu M Wei L Zhang J 《Pharmaceutical statistics》2006,5(1):7-18

Missing data in clinical trials are inevitable. We highlight the ICH guidelines and CPMP points to consider on missing data. Specifically, we outline how we should consider missing data issues when designing, planning and conducting studies to minimize missing data impact. We also go beyond the coverage of the above two documents, provide a more detailed review of the basic concepts of missing data and frequently used terminologies, and examples of the typical missing data mechanism, and discuss technical details and literature for several frequently used statistical methods and associated software. Finally, we provide a case study where the principles outlined in this paper are applied to one clinical program at protocol design, data analysis plan and other stages of a clinical trial. 相似文献

9.

Testing the proportional odds assumption in multiply imputed ordinal longitudinal data

A.F. Donneau M. Mauer P. Lambert E. Lesaffre A. Albert 《Journal of applied statistics》2015,42(10):2257-2279

A popular choice when analyzing ordinal data is to consider the cumulative proportional odds model to relate the marginal probabilities of the ordinal outcome to a set of covariates. However, application of this model relies on the condition of identical cumulative odds ratios across the cut-offs of the ordinal outcome; the well-known proportional odds assumption. This paper focuses on the assessment of this assumption while accounting for repeated and missing data. In this respect, we develop a statistical method built on multiple imputation (MI) based on generalized estimating equations that allows to test the proportionality assumption under the missing at random setting. The performance of the proposed method is evaluated for two MI algorithms for incomplete longitudinal ordinal data. The impact of both MI methods is compared with respect to the type I error rate and the power for situations covering various numbers of categories of the ordinal outcome, sample sizes, rates of missingness, well-balanced and skewed data. The comparison of both MI methods with the complete-case analysis is also provided. We illustrate the use of the proposed methods on a quality of life data from a cancer clinical trial. 相似文献

10.

Practical modeling strategies for unbalanced longitudinal data analysis

Enrico A. Colosimo Maria Arlene Fausto Marta Afonso Freitas Jorge Andrade Pinto 《Journal of applied statistics》2012,39(9):2005-2013

In practice, data are often measured repeatedly on the same individual at several points in time. Main interest often relies in characterizing the way the response changes in time, and the predictors of that change. Marginal, mixed and transition are frequently considered to be the main models for continuous longitudinal data analysis. These approaches are proposed primarily for balanced longitudinal design. However, in clinic studies, data are usually not balanced and some restrictions are necessary in order to use these models. This paper was motivated by a data set related to longitudinal height measurements in children of HIV-infected mothers that was recorded at the university hospital of the Federal University in Minas Gerais, Brazil. This data set is severely unbalanced. The goal of this paper is to assess the application of continuous longitudinal models for the analysis of unbalanced data set. 相似文献

11.

Missing data: Discussion points from the PSI missing data expert group

Tomasz Burzykowski James Carpenter Corneel Coens Daniel Evans Lesley France Mike Kenward Peter Lane James Matcham David Morgan Alan Phillips James Roger Brian Sullivan Ian White Ly‐Mee Yu of the PSI Missing Data Expert Group 《Pharmaceutical statistics》2010,9(4):288-297

The Points to Consider Document on Missing Data was adopted by the Committee of Health and Medicinal Products (CHMP) in December 2001. In September 2007 the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop‐outs, explaining the role and limitations of the ‘last observation carried forward’ method and describing the CHMP's cautionary stance on the use of mixed models. In preparation for the release of the updated guidance document, statisticians in the Pharmaceutical Industry held a one‐day expert group meeting in September 2008. Topics that were debated included minimizing the extent of missing data and understanding the missing data mechanism, defining the principles for handling missing data and understanding the assumptions underlying different analysis methods. A clear message from the meeting was that at present, biostatisticians tend only to react to missing data. Limited pro‐active planning is undertaken when designing clinical trials. Missing data mechanisms for a trial need to be considered during the planning phase and the impact on the objectives assessed. Another area for improvement is in the understanding of the pattern of missing data observed during a trial and thus the missing data mechanism via the plotting of data; for example, use of Kaplan–Meier curves looking at time to withdrawal. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献

12.

A structured approach to choosing estimands and estimators in longitudinal clinical trials

C. H. Mallinckrodt Q. Lin I. Lipkovich G. Molenberghs 《Pharmaceutical statistics》2012,11(6):456-461

An important evolution in the missing data arena has been the recognition of need for clarity in objectives. The objectives of primary focus in clinical trials can often be categorized as assessing efficacy or effectiveness. The present investigation illustrated a structured framework for choosing estimands and estimators when testing investigational drugs to treat the symptoms of chronic illnesses. Key issues were discussed and illustrated using a reanalysis of the confirmatory trials from a new drug application in depression. The primary analysis used a likelihood‐based approach to assess efficacy: mean change to the planned endpoint of the trial assuming patients stayed on drug. Secondarily, effectiveness was assessed using a multiple imputation approach. The imputation model—derived solely from the placebo group—was used to impute missing values for both the drug and placebo groups. Therefore, this so‐called placebo multiple imputation (a.k.a. controlled imputation) approach assumed patients had reduced benefit from the drug after discontinuing it. Results from the example data provided clear evidence of efficacy for the experimental drug and characterized its effectiveness. Data after discontinuation of study medication were not required for these analyses. Given the idiosyncratic nature of drug development, no estimand or approach is universally appropriate. However, the general practice of pairing efficacy and effectiveness estimands may often be useful in understanding the overall risks and benefits of a drug. Controlled imputation approaches, such as placebo multiple imputation, can be a flexible and transparent framework for formulating primary analyses of effectiveness estimands and sensitivity analyses for efficacy estimands. Copyright © 2012 John Wiley & Sons, Ltd. 相似文献

13.

Impact of the non-distinctness and non-ignorability on the inference by multiple imputation in multivariate multilevel data: a simulation assessment

Recai Yucel 《Journal of Statistical Computation and Simulation》2017,87(9):1813-1826

Multiple imputation (MI) is an increasingly popular method for analysing incomplete multivariate data sets. One of the most crucial assumptions of this method relates to mechanism leading to missing data. Distinctness is typically assumed, which indicates a complete independence of mechanisms underlying missingness and data generation. In addition, missing at random or missing completely at random is assumed, which explicitly states under which conditions missingness is independent of observed data. Despite common use of MI under these assumptions, plausibility and sensitivity to these fundamental assumptions have not been well-investigated. In this work, we investigate the impact of non-distinctness and non-ignorability. In particular, non-ignorability is due to unobservable cluster-specific effects (e.g. random-effects). Through a comprehensive simulation study, we show that MI inferences suggest that nonignoriability due to non-distinctness do not immediately imply dismal performance while non-ignorability due to missing not at random leads to quite subpar performance. 相似文献

14.

The impact of missing data and how it is handled on the rate of false-positive results in drug development

Barnes SA Mallinckrodt CH Lindborg SR Carter MK 《Pharmaceutical statistics》2008,7(3):215-225

In drug development, a common choice for the primary analysis is to assess mean changes via analysis of (co)variance with missing data imputed by carrying the last or baseline observations forward (LOCF, BOCF). These approaches assume that data are missing completely at random (MCAR). Multiple imputation (MI) and likelihood-based repeated measures (MMRM) are less restrictive as they assume data are missing at random (MAR). Nevertheless, LOCF and BOCF remain popular, perhaps because it is thought that the bias in these methods lead to protection against falsely concluding that a drug is more effective than the control. We conducted a simulation study that compared the rate of false positive results or regulatory risk error (RRE) from BOCF, LOCF, MI, and MMRM in 32 scenarios that were generated from a 2(5) full factorial arrangement with data missing due to a missing not at random (MNAR) mechanism. Both BOCF and LOCF inflated RRE were compared to MI and MMRM. In 12 of the 32 scenarios, BOCF yielded inflated RRE compared with eight scenarios for LOCF, three scenarios for MI and four scenarios for MMRM. In no situation did BOCF or LOCF provide adequate control of RRE when MI and MMRM did not. Both MI and MMRM are better choices than either BOCF or LOCF for the primary analysis. 相似文献

15.

Type I error rates from likelihood‐based repeated measures analyses of incomplete longitudinal data

Craig H. Mallinckrodt Christopher J. Kaiser John G. Watkin Michael J. Detke Geert Molenberghs Raymond J. Carroll 《Pharmaceutical statistics》2004,3(3):171-186

The last observation carried forward (LOCF) approach is commonly utilized to handle missing values in the primary analysis of clinical trials. However, recent evidence suggests that likelihood‐based analyses developed under the missing at random (MAR) framework are sensible alternatives. The objective of this study was to assess the Type I error rates from a likelihood‐based MAR approach – mixed‐model repeated measures (MMRM) – compared with LOCF when estimating treatment contrasts for mean change from baseline to endpoint (Δ). Data emulating neuropsychiatric clinical trials were simulated in a 4 × 4 factorial arrangement of scenarios, using four patterns of mean changes over time and four strategies for deleting data to generate subject dropout via an MAR mechanism. In data with no dropout, estimates of Δ and SE_Δ from MMRM and LOCF were identical. In data with dropout, the Type I error rates (averaged across all scenarios) for MMRM and LOCF were 5.49% and 16.76%, respectively. In 11 of the 16 scenarios, the Type I error rate from MMRM was at least 1.00% closer to the expected rate of 5.00% than the corresponding rate from LOCF. In no scenario did LOCF yield a Type I error rate that was at least 1.00% closer to the expected rate than the corresponding rate from MMRM. The average estimate of SE_Δ from MMRM was greater in data with dropout than in complete data, whereas the average estimate of SE_Δ from LOCF was smaller in data with dropout than in complete data, suggesting that standard errors from MMRM better reflected the uncertainty in the data. The results from this investigation support those from previous studies, which found that MMRM provided reasonable control of Type I error even in the presence of MNAR missingness. No universally best approach to analysis of longitudinal data exists. However, likelihood‐based MAR approaches have been shown to perform well in a variety of situations and are a sensible alternative to the LOCF approach. MNAR methods can be used within a sensitivity analysis framework to test the potential presence and impact of MNAR data, thereby assessing robustness of results from an MAR method. Copyright © 2004 John Wiley & Sons, Ltd. 相似文献

16.

Growth curve analysis of complete and incomplete longitudinal data

Robert F. Woolson James D. Leeper 《统计学通讯:理论与方法》2013,42(14):1491-1513

Growth curve analysis is reviewed for both complete and incomplete data structures. Specific attention is paid to the incomplete data model of Kleinbaum and its applicability to the mixed longitudinal study. Design considerations and efficiency for various incomplete longitudinal studies are also discussed. 相似文献

17.

Imputation of missing variance data using non-linear mixed effects modelling to enable an inverse variance weighted meta-analysis of summary-level longitudinal data: a case study

Boucher M 《Pharmaceutical statistics》2012,11(4):318-324

Missing variances, on the basis of the summary-level data, can be a problem when an inverse variance weighted meta-analysis is undertaken. A wide range of approaches in dealing with this issue exist, such as excluding data without a variance measure, using a function of sample size as a weight and imputing the missing standard errors/deviations. A non-linear mixed effects modelling approach was taken to describe the time-course of standard deviations across 14 studies. The model was then used to make predictions of the missing standard deviations, thus, enabling a precision weighted model-based meta-analysis of a mean pain endpoint over time. Maximum likelihood and Bayesian approaches were implemented with example code to illustrate how this imputation can be carried out and to compare the output from each method. The resultant imputations were nearly identical for the two approaches. This modelling approach acknowledges the fact that standard deviations are not necessarily constant over time and can differ between treatments and across studies in a predictable way. 相似文献

18.

Different methods for handling incomplete longitudinal binary outcome due to missing at random dropout

《Statistical Methodology》2015

This paper compares the performance of weighted generalized estimating equations (WGEEs), multiple imputation based on generalized estimating equations (MI-GEEs) and generalized linear mixed models (GLMMs) for analyzing incomplete longitudinal binary data when the underlying study is subject to dropout. The paper aims to explore the performance of the above methods in terms of handling dropouts that are missing at random (MAR). The methods are compared on simulated data. The longitudinal binary data are generated from a logistic regression model, under different sample sizes. The incomplete data are created for three different dropout rates. The methods are evaluated in terms of bias, precision and mean square error in case where data are subject to MAR dropout. In conclusion, across the simulations performed, the MI-GEE method performed better in both small and large sample sizes. Evidently, this should not be seen as formal and definitive proof, but adds to the body of knowledge about the methods’ relative performance. In addition, the methods are compared using data from a randomized clinical trial. 相似文献

19.

Missing data techniques for multilevel data: implications of model misspecification

Anne C. Black Ofer Harel D. Betsy McCoach 《Journal of applied statistics》2011,38(9):1845-1865

When modeling multilevel data, it is important to accurately represent the interdependence of observations within clusters. Ignoring data clustering may result in parameter misestimation. However, it is not well established to what degree parameter estimates are affected by model misspecification when applying missing data techniques (MDTs) to incomplete multilevel data. We compare the performance of three MDTs with incomplete hierarchical data. We consider the impact of imputation model misspecification on the quality of parameter estimates by employing multiple imputation under assumptions of a normal model (MI/NM) with two-level cross-sectional data when values are missing at random on the dependent variable at rates of 10%, 30%, and 50%. Five criteria are used to compare estimates from MI/NM to estimates from MI assuming a linear mixed model (MI/LMM) and maximum likelihood estimation to the same incomplete data sets. With 10% missing data (MD), techniques performed similarly for fixed-effects estimates, but variance components were biased with MI/NM. Effects of model misspecification worsened at higher rates of MD, with the hierarchical structure of the data markedly underrepresented by biased variance component estimates. MI/LMM and maximum likelihood provided generally accurate and unbiased parameter estimates but performance was negatively affected by increased rates of MD. 相似文献

20.

Reference‐based sensitivity analysis for time‐to‐event data

Andrew Atkinson Michael G. Kenward Tim Clayton James R. Carpenter 《Pharmaceutical statistics》2019,18(6):645-658

The analysis of time‐to‐event data typically makes the censoring at random assumption, ie, that—conditional on covariates in the model—the distribution of event times is the same, whether they are observed or unobserved (ie, right censored). When patients who remain in follow‐up stay on their assigned treatment, then analysis under this assumption broadly addresses the de jure, or “while on treatment strategy” estimand. In such cases, we may well wish to explore the robustness of our inference to more pragmatic, de facto or “treatment policy strategy,” assumptions about the behaviour of patients post‐censoring. This is particularly the case when censoring occurs because patients change, or revert, to the usual (ie, reference) standard of care. Recent work has shown how such questions can be addressed for trials with continuous outcome data and longitudinal follow‐up, using reference‐based multiple imputation. For example, patients in the active arm may have their missing data imputed assuming they reverted to the control (ie, reference) intervention on withdrawal. Reference‐based imputation has two advantages: (a) it avoids the user specifying numerous parameters describing the distribution of patients' postwithdrawal data and (b) it is, to a good approximation, information anchored, so that the proportion of information lost due to missing data under the primary analysis is held constant across the sensitivity analyses. In this article, we build on recent work in the survival context, proposing a class of reference‐based assumptions appropriate for time‐to‐event data. We report a simulation study exploring the extent to which the multiple imputation estimator (using Rubin's variance formula) is information anchored in this setting and then illustrate the approach by reanalysing data from a randomized trial, which compared medical therapy with angioplasty for patients presenting with angina. 相似文献