期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

The statistician's role in the prevention of missing data

S Hughes J Harris N Flack RL Cuffe 《Pharmaceutical statistics》2012,11(5):410-416

Considerable statistical research has been performed in recent years to develop sophisticated statistical methods for handling missing data and dropouts in the analysis of clinical trial data. However, if statisticians and other study team members proactively set out at the trial initiation stage to assess the impact of missing data and investigate ways to reduce dropouts, there is considerable potential to improve the clarity and quality of trial results and also increase efficiency. This paper presents a Human Immunodeficiency Virus (HIV) case study where statisticians led a project to reduce dropouts. The first step was to perform a pooled analysis of past HIV trials investigating which patient subgroups are more likely to drop out. The second step was to educate internal and external trial staff at all levels about the patient types more likely to dropout, and the impact this has on data quality and sample sizes required. The final step was to work collaboratively with clinical trial teams to create proactive plans regarding focused retention efforts, identifying ways to increase retention particularly in patients most at risk. It is acknowledged that identifying the specific impact of new patient retention efforts/tools is difficult because patient retention can be influenced by overall study design, investigational product tolerability profile, current standard of care and treatment access for the disease under study, which may vary over time. However, the implementation of new retention strategies and efforts within clinical trial teams attests to the influence of the analyses described in this case study. Copyright © 2012 John Wiley & Sons, Ltd. 相似文献

2.

Missing data in clinical trials: control‐based mean imputation and sensitivity analysis

下载免费PDF全文

Devan V. Mehrotra Fang Liu Thomas Permutt 《Pharmaceutical statistics》2017,16(5):378-392

In some randomized (drug versus placebo) clinical trials, the estimand of interest is the between‐treatment difference in population means of a clinical endpoint that is free from the confounding effects of “rescue” medication (e.g., HbA1c change from baseline at 24 weeks that would be observed without rescue medication regardless of whether or when the assigned treatment was discontinued). In such settings, a missing data problem arises if some patients prematurely discontinue from the trial or initiate rescue medication while in the trial, the latter necessitating the discarding of post‐rescue data. We caution that the commonly used mixed‐effects model repeated measures analysis with the embedded missing at random assumption can deliver an exaggerated estimate of the aforementioned estimand of interest. This happens, in part, due to implicit imputation of an overly optimistic mean for “dropouts” (i.e., patients with missing endpoint data of interest) in the drug arm. We propose an alternative approach in which the missing mean for the drug arm dropouts is explicitly replaced with either the estimated mean of the entire endpoint distribution under placebo (primary analysis) or a sequence of increasingly more conservative means within a tipping point framework (sensitivity analysis); patient‐level imputation is not required. A supplemental “dropout = failure” analysis is considered in which a common poor outcome is imputed for all dropouts followed by a between‐treatment comparison using quantile regression. All analyses address the same estimand and can adjust for baseline covariates. Three examples and simulation results are used to support our recommendations. 相似文献

3.

Handling of missing data in long‐term clinical trials: a case study

Mark Janssens Geert Molenberghs René Kerstens 《Pharmaceutical statistics》2012,11(6):442-448

Missing data in clinical trials is a well‐known problem, and the classical statistical methods used can be overly simple. This case study shows how well‐established missing data theory can be applied to efficacy data collected in a long‐term open‐label trial with a discontinuation rate of almost 50%. Satisfaction with treatment in chronically constipated patients was the efficacy measure assessed at baseline and every 3 months postbaseline. The improvement in treatment satisfaction from baseline was originally analyzed with a paired t‐test ignoring missing data and discarding the correlation structure of the longitudinal data. As the original analysis started from missing completely at random assumptions regarding the missing data process, the satisfaction data were re‐examined, and several missing at random (MAR) and missing not at random (MNAR) techniques resulted in adjusted estimate for the improvement in satisfaction over 12 months. Throughout the different sensitivity analyses, the effect sizes remained significant and clinically relevant. Thus, even for an open‐label trial design, sensitivity analysis, with different assumptions for the nature of dropouts (MAR or MNAR) and with different classes of models (selection, pattern‐mixture, or multiple imputation models), has been found useful and provides evidence towards the robustness of the original analyses; additional sensitivity analyses could be undertaken to further qualify robustness. Copyright © 2012 John Wiley & Sons, Ltd. 相似文献

4.

Empirical evaluation of the implementation of the EMA guideline on missing data in confirmatory clinical trials: Specification of mixed models for longitudinal data in study protocols

Sebastian Hckl Armin Koch Florian Lasch 《Pharmaceutical statistics》2019,18(6):636-644

In confirmatory clinical trials, the prespecification of the primary analysis model is a universally accepted scientific principle to allow strict control of the type I error. Consequently, both the ICH E9 guideline and the European Medicines Agency (EMA) guideline on missing data in confirmatory clinical trials require that the primary analysis model is defined unambiguously. This requirement applies to mixed models for longitudinal data handling missing data implicitly. To evaluate the compliance with the EMA guideline, we evaluated the model specifications in those clinical study protocols from development phases II and III submitted between 2015 and 2018 to the Ethics Committee at Hannover Medical School under the German Medicinal Products Act, which planned to use a mixed model for longitudinal data in the confirmatory testing strategy. Overall, 39 trials from different types of sponsors and a wide range of therapeutic areas were evaluated. While nearly all protocols specify the fixed and random effects of the analysis model (95%), only 77% give the structure of the covariance matrix used for modeling the repeated measurements. Moreover, the testing method (36%), the estimation method (28%), the computation method (3%), and the fallback strategy (18%) are given by less than half the study protocols. Subgroup analyses indicate that these findings are universal and not specific to clinical trial phases or size of company. Altogether, our results show that guideline compliance is to various degrees poor and consequently, strict type I error rate control at the intended level is not guaranteed. 相似文献

5.

Time-varying coefficient cumulative gap time models for intensive longitudinal ecological momentary assessment data with missingness

Xiaoxue Li Stewart J. Anderson Saul Shiffman Bo Zhang 《Journal of applied statistics》2022,49(2):498

Ecological momentary assessment (EMA) studies investigate intensive repeated observations of the current behavior and experiences of subjects in real time. In particular, such studies aim to minimize recall bias and maximize ecological validity, thereby strengthening the investigation and inference of microprocesses that influence behavior in real-world contexts by gathering intensive information on the temporal patterning of behavior of study subjects. Throughout this paper, we focus on the data analysis of an EMA study that examined behavior of intermittent smokers (ITS). Specifically, we sought to explore the pattern of clustered smoking behavior of ITS, or smoking ‘bouts’, as well as the covariates that predict such smoking behavior. To do this, in this paper we introduce a framework for characterizing the temporal behavior of ITS via the functions of event gap time to distinguish the smoking bouts. We used the time-varying coefficient models for the cumulative log gap time and to characterize the temporal patterns of smoking behavior, while simultaneously adjusting for behavioral covariates, and incorporated the inverse probability weighting into the models to accommodate missing data. Simulation studies showed that irrespective of whether missing by design or missing at random, the model was able to reliably determine prespecified time-varying functional forms of a given covariate coefficient, provided the the within-subject level was small. 相似文献

6.

Robust estimation of mean and covariance for longitudinal data with dropouts

Guoyou Qin 《Journal of applied statistics》2015,42(6):1240-1254

In this paper, we study estimation of linear models in the framework of longitudinal data with dropouts. Under the assumptions that random errors follow an elliptical distribution and all the subjects share the same within-subject covariance matrix which does not depend on covariates, we develop a robust method for simultaneous estimation of mean and covariance. The proposed method is robust against outliers, and does not require to model the covariance and missing data process. Theoretical properties of the proposed estimator are established and simulation studies show its good performance. In the end, the proposed method is applied to a real data analysis for illustration. 相似文献

7.

Treatment policy estimands for recurrent event data using data collected after cessation of randomised treatment

James H. Roger Daniel J. Bratton Bhabita Mayer Juan J. Abellan Oliver N. Keene 《Pharmaceutical statistics》2019,18(1):85-95

In the past, many clinical trials have withdrawn subjects from the study when they prematurely stopped their randomised treatment and have therefore only collected ‘on‐treatment’ data. Thus, analyses addressing a treatment policy estimand have been restricted to imputing missing data under assumptions drawn from these data only. Many confirmatory trials are now continuing to collect data from subjects in a study even after they have prematurely discontinued study treatment as this event is irrelevant for the purposes of a treatment policy estimand. However, despite efforts to keep subjects in a trial, some will still choose to withdraw. Recent publications for sensitivity analyses of recurrent event data have focused on the reference‐based imputation methods commonly applied to continuous outcomes, where imputation for the missing data for one treatment arm is based on the observed outcomes in another arm. However, the existence of data from subjects who have prematurely discontinued treatment but remained in the study has now raised the opportunity to use this ‘off‐treatment’ data to impute the missing data for subjects who withdraw, potentially allowing more plausible assumptions for the missing post‐study‐withdrawal data than reference‐based approaches. In this paper, we introduce a new imputation method for recurrent event data in which the missing post‐study‐withdrawal event rate for a particular subject is assumed to reflect that observed from subjects during the off‐treatment period. The method is illustrated in a trial in chronic obstructive pulmonary disease (COPD) where the primary endpoint was the rate of exacerbations, analysed using a negative binomial model. 相似文献

8.

Control‐based imputation for sensitivity analyses in informative censoring for recurrent event data

下载免费PDF全文

Fei Gao Guanghan F. Liu Donglin Zeng Lei Xu Bridget Lin Guoqing Diao Gregory Golm Joseph F. Heyse Joseph G. Ibrahim 《Pharmaceutical statistics》2017,16(6):424-432

In clinical trials, missing data commonly arise through nonadherence to the randomized treatment or to study procedure. For trials in which recurrent event endpoints are of interests, conventional analyses using the proportional intensity model or the count model assume that the data are missing at random, which cannot be tested using the observed data alone. Thus, sensitivity analyses are recommended. We implement the control‐based multiple imputation as sensitivity analyses for the recurrent event data. We model the recurrent event using a piecewise exponential proportional intensity model with frailty and sample the parameters from the posterior distribution. We impute the number of events after dropped out and correct the variance estimation using a bootstrap procedure. We apply the method to an application of sitagliptin study. 相似文献

9.

A Bayesian model for longitudinal count data with non-ignorable dropout

Kaciroti NA Raghunathan TE Schork MA Clark NM 《Journal of the Royal Statistical Society. Series C, Applied statistics》2008,57(5):521-534

Asthma is an important chronic disease of childhood. An intervention programme for managing asthma was designed on principles of self-regulation and was evaluated by a randomized longitudinal study.The study focused on several outcomes, and, typically, missing data remained a pervasive problem. We develop a pattern-mixture model to evaluate the outcome of intervention on the number of hospitalizations with non-ignorable dropouts. Pattern-mixture models are not generally identifiable as no data may be available to estimate a number of model parameters. Sensitivity analyses are performed by imposing structures on the unidentified parameters.We propose a parameterization which permits sensitivity analyses on clustered longitudinal count data that have missing values due to non-ignorable missing data mechanisms. This parameterization is expressed as ratios between event rates across missing data patterns and the observed data pattern and thus measures departures from an ignorable missing data mechanism. Sensitivity analyses are performed within a Bayesian framework by averaging over different prior distributions on the event ratios. This model has the advantage of providing an intuitive and flexible framework for incorporating the uncertainty of the missing data mechanism in the final analysis. 相似文献

10.

An R package for model fitting,model selection and the simulation for longitudinal data with dropout missingness

Cong Xu Zheng Li Yuan Xue Lijun Zhang 《统计学通讯:模拟与计算》2013,42(9):2812-2829

Abstract

Missing data arise frequently in clinical and epidemiological fields, in particular in longitudinal studies. This paper describes the core features of an R package wgeesel, which implements marginal model fitting (i.e., weighted generalized estimating equations, WGEE; doubly robust GEE) for longitudinal data with dropouts under the assumption of missing at random. More importantly, this package comprehensively provide existing information criteria for WGEE model selection on marginal mean or correlation structures. Also, it can serve as a valuable tool for simulating longitudinal data with missing outcomes. Lastly, a real data example and simulations are presented to illustrate and validate our package. 相似文献

11.

Variable selection for longitudinal data with high-dimensional covariates and dropouts

Xueying Zheng Bo Fu Jiajia Zhang 《Journal of Statistical Computation and Simulation》2018,88(4):712-725

A new variable selection approach utilizing penalized estimating equations is developed for high-dimensional longitudinal data with dropouts under a missing at random (MAR) mechanism. The proposed method is based on the best linear approximation of efficient scores from the full dataset and does not need to specify a separate model for the missing or imputation process. The coordinate descent algorithm is adopted to implement the proposed method and is computational feasible and stable. The oracle property is established and extensive simulation studies show that the performance of the proposed variable selection method is much better than that of penalized estimating equations dealing with complete data which do not account for the MAR mechanism. In the end, the proposed method is applied to a Lifestyle Education for Activity and Nutrition study and the interaction effect between intervention and time is identified, which is consistent with previous findings. 相似文献

12.

Application of sensitivity analysis to incomplete longitudinal CD4 count data

Abdul-Karim Iddrisu Freedom Gumedze 《Journal of applied statistics》2019,46(4):754-769

In this paper, we investigate the effect of tuberculosis pericarditis (TBP) treatment on CD4 count changes over time and draw inferences in the presence of missing data. We accounted for missing data and conducted sensitivity analyses to assess whether inferences under missing at random (MAR) assumption are sensitive to not missing at random (NMAR) assumptions using the selection model (SeM) framework. We conducted sensitivity analysis using the local influence approach and stress-testing analysis. Our analyses showed that the inferences from the MAR are robust to the NMAR assumption and influential subjects do not overturn the study conclusions about treatment effects and the dropout mechanism. Therefore, the missing CD4 count measurements are likely to be MAR. The results also revealed that TBP treatment does not interact with HIV/AIDS treatment and that TBP treatment has no significant effect on CD4 count changes over time. Although the methods considered were applied to data in the IMPI trial setting, the methods can also be applied to clinical trials with similar settings. 相似文献

13.

Evaluation of incomplete multiple diagnostic tests,with an application in the colon cancer family registry study

Yi Zhang Haitao Chu Donglin Zeng 《Journal of applied statistics》2014,41(3):688-700

Accurate diagnosis of a molecularly defined subtype of cancer is often an important step toward its effective control and treatment. For the diagnosis of some subtypes of a cancer, a gold standard with perfect sensitivity and specificity may be unavailable. In those scenarios, tumor subtype status is commonly measured by multiple imperfect diagnostic markers. Additionally, in many such studies, some subjects are only measured by a subset of diagnostic tests and the missing probabilities may depend on the unknown disease status. In this paper, we present statistical methods based on the EM algorithm to evaluate incomplete multiple imperfect diagnostic tests under a missing at random assumption and one missing not at random scenario. We apply the proposed methods to a real data set from the National Cancer Institute (NCI) colon cancer family registry on diagnosing microsatellite instability for hereditary non-polyposis colorectal cancer to estimate diagnostic accuracy parameters (i.e. sensitivities and specificities), prevalence, and potential differential missing probabilities for 11 biomarker tests. Simulations are also conducted to evaluate the small-sample performance of our methods. 相似文献

14.

Identifying treatment effects using trimmed means when data are missing not at random

Alex Ocampo Heinz Schmidli Peter Quarg Francesca Callegari Marcello Pagano 《Pharmaceutical statistics》2021,20(6):1265-1277

Patients often discontinue from a clinical trial because their health condition is not improving or they cannot tolerate the assigned treatment. Consequently, the observed clinical outcomes in the trial are likely better on average than if every patient had completed the trial. If these differences between trial completers and non-completers cannot be explained by the observed data, then the study outcomes are missing not at random (MNAR). One way to overcome this problem—the trimmed means approach for missing data due to study discontinuation—sets missing values as the worst observed outcome and then trims away a fraction of the distribution from each treatment arm before calculating differences in treatment efficacy (Permutt T, Li F. Trimmed means for symptom trials with dropouts. Pharm Stat. 2017;16(1):20–28). In this paper, we derive sufficient and necessary conditions for when this approach can identify the average population treatment effect. Simulation studies show the trimmed means approach's ability to effectively estimate treatment efficacy when data are MNAR and missingness due to study discontinuation is strongly associated with an unfavorable outcome, but trimmed means fail when data are missing at random. If the reasons for study discontinuation in a clinical trial are known, analysts can improve estimates with a combination of multiple imputation and the trimmed means approach when the assumptions of each hold. We compare the methodology to existing approaches using data from a clinical trial for chronic pain. An R package trim implements the method. When the assumptions are justifiable, using trimmed means can help identify treatment effects notwithstanding MNAR data. 相似文献

15.

Validity and efficiency in analyzing ordinal responses with missing observations

Xichen She Changbao Wu 《Revue canadienne de statistique》2020,48(2):138-151

This article addresses issues in creating public-use data files in the presence of missing ordinal responses and subsequent statistical analyses of the dataset by users. The authors propose a fully efficient fractional imputation (FI) procedure for ordinal responses with missing observations. The proposed imputation strategy retrieves the missing values through the full conditional distribution of the response given the covariates and results in a single imputed data file that can be analyzed by different data users with different scientific objectives. Two most critical aspects of statistical analyses based on the imputed data set, validity and efficiency, are examined through regression analysis involving the ordinal response and a selected set of covariates. It is shown through both theoretical development and simulation studies that, when the ordinal responses are missing at random, the proposed FI procedure leads to valid and highly efficient inferences as compared to existing methods. Variance estimation using the fractionally imputed data set is also discussed. The Canadian Journal of Statistics 48: 138–151; 2020 © 2019 Statistical Society of Canada 相似文献

16.

Missing Data Mechanisms for Analysing Longitudinal Data with Incomplete Observations in Both Responses and Covariates

下载免费PDF全文

Haocheng Li Grace Y. Yi 《Australian & New Zealand Journal of Statistics》2016,58(3):377-396

Missing observations in both responses and covariates arise frequently in longitudinal studies. When missing data are missing not at random, inferences under the likelihood framework often require joint modelling of response and covariate processes, as well as missing data processes associated with incompleteness of responses and covariates. Specification of these four joint distributions is a nontrivial issue from the perspectives of both modelling and computation. To get around this problem, we employ pairwise likelihood formulations, which avoid the specification of third or higher order association structures. In this paper, we consider three specific missing data mechanisms which lead to further simplified pairwise likelihood (SPL) formulations. Under these missing data mechanisms, inference methods based on SPL formulations are developed. The resultant estimators are consistent, and enjoy better robustness and computation convenience. The performance is evaluated empirically though simulation studies. Longitudinal data from the National Population Health Survey and Waterloo Smoking Prevention Project are analysed to illustrate the usage of our methods. 相似文献

17.

Sensitivity analyses for partially observed recurrent event data

下载免费PDF全文

Mouna Akacha Emmanuel O. Ogundimu 《Pharmaceutical statistics》2016,15(1):4-14

Recurrent events involve the occurrences of the same type of event repeatedly over time and are commonly encountered in longitudinal studies. Examples include seizures in epileptic studies or occurrence of cancer tumors. In such studies, interest lies in the number of events that occur over a fixed period of time. One considerable challenge in analyzing such data arises when a large proportion of patients discontinues before the end of the study, for example, because of adverse events, leading to partially observed data. In this situation, data are often modeled using a negative binomial distribution with time‐in‐study as offset. Such an analysis assumes that data are missing at random (MAR). As we cannot test the adequacy of MAR, sensitivity analyses that assess the robustness of conclusions across a range of different assumptions need to be performed. Sophisticated sensitivity analyses for continuous data are being frequently performed. However, this is less the case for recurrent event or count data. We will present a flexible approach to perform clinically interpretable sensitivity analyses for recurrent event data. Our approach fits into the framework of reference‐based imputations, where information from reference arms can be borrowed to impute post‐discontinuation data. Different assumptions about the future behavior of dropouts dependent on reasons for dropout and received treatment can be made. The imputation model is based on a flexible model that allows for time‐varying baseline intensities. We assess the performance in a simulation study and provide an illustration with a clinical trial in patients who suffer from bladder cancer. Copyright © 2015 John Wiley & Sons, Ltd. 相似文献

18.

An estimated‐score approach for dealing with missing covariate data in matched case–control studies

Samiran Sinha 《Revue canadienne de statistique》2010,38(4):680-697

Matched case–control designs are commonly used in epidemiological studies for estimating the effect of exposure variables on the risk of a disease by controlling the effect of confounding variables. Due to retrospective nature of the study, information on a covariate could be missing for some subjects. A straightforward application of the conditional logistic likelihood for analyzing matched case–control data with the partially missing covariate may yield inefficient estimators of the parameters. A robust method has been proposed to handle this problem using an estimated conditional score approach when the missingness mechanism does not depend on the disease status. Within the conditional logistic likelihood framework, an empirical procedure is used to estimate the odds of the disease for the subjects with missing covariate values. The asymptotic distribution and the asymptotic variance of the estimator when the matching variables and the completely observed covariates are categorical. The finite sample performance of the proposed estimator is assessed through a simulation study. Finally, the proposed method has been applied to analyze two matched case–control studies. The Canadian Journal of Statistics 38: 680–697; 2010 © 2010 Statistical Society of Canada 相似文献

19.

Kernel Smoothing Density Estimation when Group Membership is Subject to Missing

Tang W He H Gunzler D 《Journal of statistical planning and inference》2012,142(3):685-694

Density function is a fundamental concept in data analysis. Non-parametric methods including kernel smoothing estimate are available if the data is completely observed. However, in studies such as diagnostic studies following a two-stage design the membership of some of the subjects may be missing. Simply ignoring those subjects with unknown membership is valid only in the MCAR situation. In this paper, we consider kernel smoothing estimate of the density functions, using the inverse probability approaches to address the missing values. We illustrate the approaches with simulation studies and real study data in mental health. 相似文献

20.

GRAPHICAL SENSITIVITY ANALYSIS WITH DIFFERENT METHODS OF IMPUTATION FOR A TRIAL WITH PROBABLE NON‐IGNORABLE MISSING DATA

M. Weatherall R.M. Pickering Scott Harris 《Australian & New Zealand Journal of Statistics》2009,51(4):397-413

Graphical sensitivity analyses have recently been recommended for clinical trials with non‐ignorable missing outcome. We demonstrate an adaptation of this methodology for a continuous outcome of a trial of three cognitive‐behavioural therapies for mild depression in primary care, in which one arm had unexpectedly high levels of missing data. Fixed‐value and multiple imputations from a normal distribution (assuming either varying mean and fixed standard deviation, or fixed mean and varying standard deviation) were used to obtain contour plots of the contrast estimates with their P‐values superimposed, their confidence intervals, and the root mean square errors. Imputation was based either on the outcome value alone, or on change from baseline. The plots showed fixed‐value imputation to be more sensitive than imputing from a normal distribution, but the normally distributed imputations were subject to sampling noise. The contours of the sensitivity plots were close to linear in appearance, with the slope approximately equal to the ratio of the proportions of subjects with missing data in each trial arm. 相似文献