Similar literature
 20 similar articles found (search time: 62 ms)
1.
    
In clinical trials, missing data commonly arise through nonadherence to the randomized treatment or to study procedures. For trials in which recurrent event endpoints are of interest, conventional analyses using the proportional intensity model or the count model assume that the data are missing at random, an assumption that cannot be tested using the observed data alone. Thus, sensitivity analyses are recommended. We implement control‐based multiple imputation as a sensitivity analysis for recurrent event data. We model the recurrent events using a piecewise exponential proportional intensity model with frailty and sample the parameters from the posterior distribution. We impute the number of events after dropout and correct the variance estimation using a bootstrap procedure. We apply the method to a sitagliptin study.

2.
Randomized controlled trials (RCTs) are the gold standard for evaluation of the efficacy and safety of investigational interventions. If every patient in an RCT were to adhere to the randomized treatment, one could simply analyze the complete data to infer the treatment effect. However, intercurrent events (ICEs) including the use of concomitant medication for unsatisfactory efficacy, treatment discontinuation due to adverse events, or lack of efficacy may lead to interventions that deviate from the original treatment assignment. Therefore, defining the appropriate estimand (the appropriate parameter to be estimated) based on the primary objective of the study is critical prior to determining the statistical analysis method and analyzing the data. The International Council for Harmonisation (ICH) E9 (R1), adopted on November 20, 2019, provided five strategies to define the estimand: treatment policy, hypothetical, composite variable, while on treatment, and principal stratum. In this article, we propose an estimand using a mix of strategies in handling ICEs. This estimand is an average of the “null” treatment difference for those with ICEs potentially related to safety and the treatment difference for the other patients if they would complete the assigned treatments. Two examples from clinical trials evaluating antidiabetes treatments are provided to illustrate the estimation of this proposed estimand and to compare it with the estimates for estimands using hypothetical and treatment policy strategies in handling ICEs.

3.
Statistical analyses of recurrent event data have typically been based on the missing at random assumption. One implication of this is that, if data are collected only when patients are on their randomized treatment, the resulting de jure estimator of treatment effect corresponds to the situation in which the patients adhere to this regime throughout the study. For confirmatory analysis of clinical trials, sensitivity analyses are required to investigate alternative de facto estimands that depart from this assumption. Recent publications have described the use of multiple imputation methods based on pattern mixture models for continuous outcomes, where imputation for the missing data for one treatment arm (e.g. the active arm) is based on the statistical behaviour of outcomes in another arm (e.g. the placebo arm). This has been referred to as controlled imputation or reference‐based imputation. In this paper, we use the negative multinomial distribution to apply this approach to analyses of recurrent events and other similar outcomes. The methods are illustrated by a trial in severe asthma where the primary endpoint was rate of exacerbations and the primary analysis was based on the negative binomial model. Copyright © 2014 John Wiley & Sons, Ltd.
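The reference-based idea for counts described in this abstract can be sketched with a gamma-Poisson mixture, which is one common route to a negative multinomial model for counts in disjoint periods. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function names, the parameter names, and the "jump to reference" rule are hypothetical, and the event rates and the shape parameter are treated as known, whereas a full multiple imputation would also draw them afresh for each imputed data set.

```python
import random


def draw_poisson(rng, mean):
    """Draw a Poisson count by counting unit-rate exponential arrivals before `mean`."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def impute_post_dropout_count(rng, y_obs, t_obs, t_rest, rate_own, rate_ref, shape):
    """Impute one patient's unobserved event count (hypothetical jump-to-reference rule).

    y_obs, t_obs : events seen and time at risk before dropout
    t_rest       : unobserved follow-up time after dropout
    rate_own     : event rate of the patient's own arm (assumed known here)
    rate_ref     : event rate of the reference arm (assumed known here)
    shape        : gamma shape of the patient-level frailty (1 / dispersion)
    """
    # Frailty prior is Gamma(shape, rate=shape), i.e. mean 1; the observed
    # Poisson count updates it to Gamma(shape + y_obs, rate=shape + rate_own * t_obs).
    post_shape = shape + y_obs
    post_rate = shape + rate_own * t_obs
    frailty = rng.gammavariate(post_shape, 1.0 / post_rate)  # 2nd argument is a scale
    # After dropout the patient is assumed to follow the reference-arm rate.
    return draw_poisson(rng, frailty * rate_ref * t_rest)


rng = random.Random(2014)
imputed = [
    impute_post_dropout_count(rng, y_obs=3, t_obs=0.5, t_rest=0.5,
                              rate_own=1.5, rate_ref=2.0, shape=2.0)
    for _ in range(5)
]
print(imputed)  # five imputed post-dropout counts for the same patient
```

Keeping the patient's own frailty while switching the rate to that of the reference arm is what lets the imputed count depend on the patient's observed history; other reference-based rules would change only the final line of the function.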

4.
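Using multiple imputation, this article imputes multivariate samples with missing values under different missing rates and missingness patterns, and studies how the multiple-imputation error depends on the missing rate and the missingness pattern. The results show that when the missing rate is between 0 and 15%, the multiple-imputation error is linear in the missing rate; when the missing rate exceeds 15%, the relationship departs from linearity. The multiple-imputation error is positively correlated with the variance-to-mean ratio of the missingness pattern: the larger the variance-to-mean ratio, the larger the error.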

5.
We present results of a Monte Carlo study comparing four methods of estimating the parameters of the logistic model logit(Pr(Y = 1 | X, Z)) = α₀ + α₁X + α₂Z, where X and Z are continuous covariates and X is always observed but Z is sometimes missing. The four methods examined are 1) logistic regression using complete cases, 2) logistic regression with filled-in values of Z obtained from the regression of Z on X and Y, 3) logistic regression with filled-in values of Z and random error added, and 4) maximum likelihood estimation assuming the distribution of Z given X and Y is normal. Effects of different percent missing for Z and different missing value mechanisms on the bias and mean absolute deviation of the estimators are examined for data sets of N = 200 and N = 400.

6.
    
The analysis of time‐to‐event data typically makes the censoring at random assumption, ie, that, conditional on covariates in the model, the distribution of event times is the same, whether they are observed or unobserved (ie, right censored). When patients who remain in follow‐up stay on their assigned treatment, then analysis under this assumption broadly addresses the de jure, or “while on treatment strategy” estimand. In such cases, we may well wish to explore the robustness of our inference to more pragmatic, de facto or “treatment policy strategy,” assumptions about the behaviour of patients post‐censoring. This is particularly the case when censoring occurs because patients change, or revert, to the usual (ie, reference) standard of care. Recent work has shown how such questions can be addressed for trials with continuous outcome data and longitudinal follow‐up, using reference‐based multiple imputation. For example, patients in the active arm may have their missing data imputed assuming they reverted to the control (ie, reference) intervention on withdrawal. Reference‐based imputation has two advantages: (a) it avoids the user specifying numerous parameters describing the distribution of patients' postwithdrawal data and (b) it is, to a good approximation, information anchored, so that the proportion of information lost due to missing data under the primary analysis is held constant across the sensitivity analyses. In this article, we build on recent work in the survival context, proposing a class of reference‐based assumptions appropriate for time‐to‐event data. We report a simulation study exploring the extent to which the multiple imputation estimator (using Rubin's variance formula) is information anchored in this setting and then illustrate the approach by reanalysing data from a randomized trial, which compared medical therapy with angioplasty for patients presenting with angina.
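Rubin's variance formula, mentioned in this abstract, combines the results of the M imputed data sets into one estimate and one variance. The following is a minimal sketch of the standard pooling rules; the function name `pool_rubin` and the example numbers are made up for illustration and do not come from the article.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M completed-data estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log hazard ratios and squared standard errors from 5 imputations.
estimate, total_var = pool_rubin(
    [-0.21, -0.18, -0.25, -0.20, -0.16],
    [0.010, 0.011, 0.010, 0.012, 0.011],
)
print(estimate, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is where the extra uncertainty due to missing data enters, so it is this term that determines whether a reference-based analysis loses more or less information than the primary analysis.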

7.
The analysis of clinical trials aiming to show symptomatic benefits is often complicated by the ethical requirement for rescue medication when the disease state of patients worsens. In type 2 diabetes trials, patients receive glucose‐lowering rescue medications continuously for the remaining trial duration, if one of several markers of glycemic control exceeds pre‐specified thresholds. This may mask differences in glycemic values between treatment groups, because it will occur more frequently in less effective treatment groups. Traditionally, the last pre‐rescue medication value was carried forward and analyzed as the end‐of‐trial value. The deficits of such simplistic single imputation approaches are increasingly recognized by regulatory authorities and trialists. We discuss alternative approaches and evaluate them through a simulation study. When the estimand of interest is the effect attributable to the treatments initially assigned at randomization, then our recommendation for estimation and hypothesis testing is to treat data after meeting rescue criteria as deterministically ‘missing’ at random, because initiation of rescue medication is determined by observed in‐trial values. An appropriate imputation of values after meeting rescue criteria is then possible either directly through multiple imputation or implicitly with a repeated measures model. Crucially, one needs to jointly impute or model all markers of glycemic control that can lead to the initiation of rescue medication. An alternative, for hypothesis testing only, is to use rank tests with outcomes from patients ‘requiring rescue medication’ ranked worst, and non‐rescued patients ranked according to final visit values. However, an appropriate ranking of unobserved values may be controversial. Copyright © 2015 John Wiley & Sons, Ltd.

8.
    
Recurrent events involve the occurrences of the same type of event repeatedly over time and are commonly encountered in longitudinal studies. Examples include seizures in epileptic studies or occurrence of cancer tumors. In such studies, interest lies in the number of events that occur over a fixed period of time. One considerable challenge in analyzing such data arises when a large proportion of patients discontinues before the end of the study, for example, because of adverse events, leading to partially observed data. In this situation, data are often modeled using a negative binomial distribution with time‐in‐study as offset. Such an analysis assumes that data are missing at random (MAR). As we cannot test the adequacy of MAR, sensitivity analyses that assess the robustness of conclusions across a range of different assumptions need to be performed. Sophisticated sensitivity analyses for continuous data are being frequently performed. However, this is less the case for recurrent event or count data. We will present a flexible approach to perform clinically interpretable sensitivity analyses for recurrent event data. Our approach fits into the framework of reference‐based imputations, where information from reference arms can be borrowed to impute post‐discontinuation data. Different assumptions about the future behavior of dropouts dependent on reasons for dropout and received treatment can be made. The imputation model is based on a flexible model that allows for time‐varying baseline intensities. We assess the performance in a simulation study and provide an illustration with a clinical trial in patients who suffer from bladder cancer. Copyright © 2015 John Wiley & Sons, Ltd.

9.
An important evolution in the missing data arena has been the recognition of need for clarity in objectives. The objectives of primary focus in clinical trials can often be categorized as assessing efficacy or effectiveness. The present investigation illustrated a structured framework for choosing estimands and estimators when testing investigational drugs to treat the symptoms of chronic illnesses. Key issues were discussed and illustrated using a reanalysis of the confirmatory trials from a new drug application in depression. The primary analysis used a likelihood‐based approach to assess efficacy: mean change to the planned endpoint of the trial assuming patients stayed on drug. Secondarily, effectiveness was assessed using a multiple imputation approach. The imputation model, derived solely from the placebo group, was used to impute missing values for both the drug and placebo groups. Therefore, this so‐called placebo multiple imputation (a.k.a. controlled imputation) approach assumed patients had reduced benefit from the drug after discontinuing it. Results from the example data provided clear evidence of efficacy for the experimental drug and characterized its effectiveness. Data after discontinuation of study medication were not required for these analyses. Given the idiosyncratic nature of drug development, no estimand or approach is universally appropriate. However, the general practice of pairing efficacy and effectiveness estimands may often be useful in understanding the overall risks and benefits of a drug. Controlled imputation approaches, such as placebo multiple imputation, can be a flexible and transparent framework for formulating primary analyses of effectiveness estimands and sensitivity analyses for efficacy estimands. Copyright © 2012 John Wiley & Sons, Ltd.

10.
The topic of this paper was prompted by a study for which one of us was the statistician. It was submitted to Annals of Internal Medicine. The paper had positive reviewer comment; however, the statistical reviewer stated that for the analysis to be acceptable for publication, the missing data had to be accounted for in the analysis through the use of baseline in a last observation carried forward imputation. We discuss the issues associated with this form of imputation and recommend that it should not be undertaken as a primary analysis.
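For readers unfamiliar with the imputation under discussion, last observation carried forward (LOCF) simply copies each patient's most recent observed value into later missing visits; when no post-baseline value exists, the baseline itself is carried forward. The sketch below is illustrative only: the function name `locf` and the example values are hypothetical, and missing visits are represented by `None`.

```python
def locf(values):
    """Fill each None with the most recent observed value.

    Entries before the first observed value stay None, because there is
    nothing to carry forward yet.
    """
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Baseline followed by three scheduled visits; the patient withdrew after visit 1.
print(locf([7.1, 6.8, None, None]))
# Only the baseline was observed, so the baseline is carried to every visit.
print(locf([7.1, None, None, None]))
```

The second call shows why the method is contentious as a primary analysis: a patient with no follow-up data is analyzed as if nothing had changed since baseline.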

11.
The Points to Consider Document on Missing Data was adopted by the Committee of Health and Medicinal Products (CHMP) in December 2001. In September 2007 the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop‐outs, explaining the role and limitations of the ‘last observation carried forward’ method and describing the CHMP's cautionary stance on the use of mixed models. In preparation for the release of the updated guidance document, statisticians in the Pharmaceutical Industry held a one‐day expert group meeting in September 2008. Topics that were debated included minimizing the extent of missing data and understanding the missing data mechanism, defining the principles for handling missing data and understanding the assumptions underlying different analysis methods. A clear message from the meeting was that at present, biostatisticians tend only to react to missing data. Limited pro‐active planning is undertaken when designing clinical trials. Missing data mechanisms for a trial need to be considered during the planning phase and the impact on the objectives assessed. Another area for improvement is in the understanding of the pattern of missing data observed during a trial and thus the missing data mechanism via the plotting of data; for example, use of Kaplan–Meier curves looking at time to withdrawal. Copyright © 2009 John Wiley & Sons, Ltd.

12.
In real-life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of the missing values prior to the analysis, thereby rendering the data complete. Imputation broadly encompasses an entire scope of techniques that have been developed to make inferences about incomplete data, ranging from very simple strategies (e.g. mean imputation) to more advanced approaches that require estimation, for instance, of posterior distributions using Markov chain Monte Carlo methods. Additional complexity arises when the number of missingness patterns increases and/or when both categorical and continuous random variables are involved. Implementations of routines, procedures, or packages capable of generating imputations for incomplete data are now widely available. We review some of these in the context of a motivating example, as well as in a simulation study, under two missingness mechanisms (missing at random and missing not at random). Thus far, evaluation of existing implementations has frequently centred on the resulting parameter estimates of the prescribed model of interest after imputing the missing data. In some situations, however, interest may very well be on the quality of the imputed values at the level of the individual, an issue that has received relatively little attention. In this paper, we focus on the latter to provide further insight about the performance of the different routines, procedures, and packages in this respect.

13.
    
When conducting research synthesis, the collection of studies that will be combined often do not measure the same set of variables, which creates missing data. When the studies to combine are longitudinal, missing data can occur on the observation‐level (time‐varying) or the subject‐level (non‐time‐varying). Traditionally, the focus of missing data methods for longitudinal data has been on missing observation‐level variables. In this paper, we focus on missing subject‐level variables and compare two multiple imputation approaches: a joint modeling approach and a sequential conditional modeling approach. We find the joint modeling approach to be preferable to the sequential conditional approach, except when the covariance structure of the repeated outcome for each individual has homogeneous variance and exchangeable correlation. Specifically, the regression coefficient estimates from an analysis incorporating imputed values based on the sequential conditional method are attenuated and less efficient than those from the joint method. Remarkably, the estimates from the sequential conditional method are often less efficient than a complete case analysis, which, in the context of research synthesis, implies that we lose efficiency by combining studies. Copyright © 2015 John Wiley & Sons, Ltd.

14.
In longitudinal clinical studies, after randomization at baseline, subjects are followed for a period of time for development of symptoms. The inference of interest could be the mean change from baseline to a particular visit in some lab values, the proportion of responders to some threshold category at a particular visit post baseline, or the time to some important event. However, in some applications, the interest may be in estimating the cumulative distribution function (CDF) at a fixed time point post baseline. When the data are fully observed, the CDF can be estimated by the empirical CDF. When patients discontinue prematurely during the course of the study, the empirical CDF cannot be directly used. In this paper, we use multiple imputation as a way to estimate the CDF in longitudinal studies when data are missing at random. The validity of the method is assessed on the basis of the bias and the Kolmogorov–Smirnov distance. The results suggest that multiple imputation yields less bias and less variability than the often used last observation carried forward method. Copyright © 2013 John Wiley & Sons, Ltd.
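The two quantities named in this abstract, the empirical CDF and the Kolmogorov–Smirnov distance, can be sketched in a few lines. This is an illustration rather than the paper's procedure: the function names are hypothetical, and the distance is computed here between two samples (for example, fully observed values versus values completed by some imputation method), which is an assumption about how such a comparison might be set up.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of `sample` as a function of t."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(t):
        # bisect_right counts how many observations are <= t.
        return bisect_right(ordered, t) / n

    return cdf


def ks_distance(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs.

    Both CDFs are right-continuous step functions that jump only at data
    points, so it is enough to compare them at every observed value.
    """
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    return max(abs(cdf_a(t) - cdf_b(t)) for t in set(sample_a) | set(sample_b))


full_data = [5.9, 6.4, 6.8, 7.1, 7.5, 8.0]        # hypothetical complete outcomes
locf_completed = [6.4, 6.8, 7.1, 7.1, 7.5, 7.5]   # hypothetical LOCF-completed outcomes
print(ks_distance(full_data, locf_completed))
```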

15.
Recent research has made clear that missing values in datasets are inevitable. Imputation is one of several methods that have been introduced to overcome this issue. Imputation techniques address missing data by permanently replacing missing values with reasonable estimates. These procedures have many benefits relative to their drawbacks. However, how these methods operate has not been made transparent, which leads to mistrust of analytical results. One approach to evaluating the outcome of the imputation process is to estimate the uncertainty in the imputed data. Nonparametric methods are appropriate for estimating this uncertainty when the data do not follow any particular distribution. This paper presents a nonparametric method for estimating and testing the significance of the imputation uncertainty, which is based on the Wilcoxon test statistic and can be used to estimate the precision of the values created by imputation methods. The proposed procedure can be used to judge the feasibility of the imputation process for a dataset and to evaluate the influence of proper imputation methods when they are applied to the same dataset. The proposed approach is compared with other nonparametric resampling methods, including the bootstrap and the jackknife, for estimating uncertainty in the imputed data under the Bayesian bootstrap imputation method. The ideas behind the proposed method are explained in detail, and a simulation study illustrates how the approach can be employed in practical situations.

16.
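Imputation of missing values in categorical data based on the random forest model   Total citations: 6 (self-citations: 1; citations by others: 6)
Missing data are an important factor affecting the quality of survey questionnaire data, and imputing the missing values in a questionnaire can significantly improve the quality of the survey data. Questionnaire data are predominantly categorical. Classification algorithms from data mining are a common way to handle the classification of attributes, and the random forest model is one of the more accurate of these algorithms. This paper introduces the random forest model into the study of imputation for missing questionnaire data, proposes a random-forest-based method for imputing missing values in categorical data, and discusses the corresponding imputation steps for different missingness patterns. An empirical simulation comparison with other methods shows that random forest imputation yields imputed values with higher accuracy and greater reliability.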

17.
Graphical sensitivity analyses have recently been recommended for clinical trials with non‐ignorable missing outcome. We demonstrate an adaptation of this methodology for a continuous outcome of a trial of three cognitive‐behavioural therapies for mild depression in primary care, in which one arm had unexpectedly high levels of missing data. Fixed‐value and multiple imputations from a normal distribution (assuming either varying mean and fixed standard deviation, or fixed mean and varying standard deviation) were used to obtain contour plots of the contrast estimates with their P‐values superimposed, their confidence intervals, and the root mean square errors. Imputation was based either on the outcome value alone, or on change from baseline. The plots showed fixed‐value imputation to be more sensitive than imputing from a normal distribution, but the normally distributed imputations were subject to sampling noise. The contours of the sensitivity plots were close to linear in appearance, with the slope approximately equal to the ratio of the proportions of subjects with missing data in each trial arm.
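The near-linear contours and their slope can be motivated by a short derivation. This is a sketch under simplifying assumptions that the abstract does not state: fixed-value imputation, a contrast equal to the difference between two arm means, and notation introduced here for illustration ($p_A$, $p_B$ for the proportions missing and $\delta_A$, $\delta_B$ for the imputed values in arms $A$ and $B$).

```latex
\begin{align*}
\hat{\mu}_A &= (1 - p_A)\,\bar{y}_A^{\mathrm{obs}} + p_A\,\delta_A, \qquad
\hat{\mu}_B = (1 - p_B)\,\bar{y}_B^{\mathrm{obs}} + p_B\,\delta_B \\
\hat{\Delta} &= \hat{\mu}_A - \hat{\mu}_B
  = \underbrace{(1 - p_A)\,\bar{y}_A^{\mathrm{obs}} - (1 - p_B)\,\bar{y}_B^{\mathrm{obs}}}_{\text{does not depend on } \delta_A,\ \delta_B}
  + p_A\,\delta_A - p_B\,\delta_B \\
\hat{\Delta} = c \;&\Longrightarrow\; p_A\,\delta_A - p_B\,\delta_B = \text{constant}
  \;\Longrightarrow\; \frac{d\delta_A}{d\delta_B} = \frac{p_B}{p_A}
\end{align*}
```

Under these assumptions each contour of constant contrast is exactly a straight line in the $(\delta_B, \delta_A)$ plane, with slope equal to the ratio of the missing-data proportions. Imputing from a normal distribution adds sampling noise around this line, which is consistent with the contours being described as only approximately linear.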

18.
In longitudinal studies, nonlinear mixed-effects models have been widely applied to describe the intra- and the inter-subject variations in data. The inter-subject variation usually receives great attention and it may be partially explained by time-dependent covariates. However, some covariates may be measured with substantial errors and may contain missing values. We proposed a multiple imputation method, implemented by a Markov Chain Monte-Carlo method along with Gibbs sampler, to address the covariate measurement errors and missing data in nonlinear mixed-effects models. The multiple imputation method is illustrated in a real data example. Simulation studies show that the multiple imputation method outperforms the commonly used naive methods.

19.
When modeling multilevel data, it is important to accurately represent the interdependence of observations within clusters. Ignoring data clustering may result in parameter misestimation. However, it is not well established to what degree parameter estimates are affected by model misspecification when applying missing data techniques (MDTs) to incomplete multilevel data. We compare the performance of three MDTs with incomplete hierarchical data. We consider the impact of imputation model misspecification on the quality of parameter estimates by employing multiple imputation under assumptions of a normal model (MI/NM) with two-level cross-sectional data when values are missing at random on the dependent variable at rates of 10%, 30%, and 50%. Five criteria are used to compare estimates from MI/NM to estimates from MI assuming a linear mixed model (MI/LMM) and maximum likelihood estimation to the same incomplete data sets. With 10% missing data (MD), techniques performed similarly for fixed-effects estimates, but variance components were biased with MI/NM. Effects of model misspecification worsened at higher rates of MD, with the hierarchical structure of the data markedly underrepresented by biased variance component estimates. MI/LMM and maximum likelihood provided generally accurate and unbiased parameter estimates but performance was negatively affected by increased rates of MD.

20.
Traditional factor analysis (FA) rests on the assumption of multivariate normality. However, in some practical situations, the data do not meet this assumption; thus, the statistical inference made from such data may be misleading. This paper aims at providing some new tools for the skew-normal (SN) FA model when missing values occur in the data. In such a model, the latent factors are assumed to follow a restricted version of multivariate SN distribution with additional shape parameters for accommodating skewness. We develop an analytically feasible expectation conditional maximization algorithm for carrying out parameter estimation and imputation of missing values under missing at random mechanisms. The practical utility of the proposed methodology is illustrated with two real data examples and the results are compared with those obtained from the traditional FA counterparts.
