期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A comparison of multiple imputation and doubly robust estimation for analyses with missing data 总被引：1，自引：0，他引：1

James R. Carpenter Michael G. Kenward Stijn Vansteelandt 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2006,169(3):571-584

Summary. Multiple imputation is now a well-established technique for analysing data sets where some units have incomplete observations. Provided that the imputation model is correct, the resulting estimates are consistent. An alternative, weighting by the inverse probability of observing complete data on a unit, is conceptually simple and involves fewer modelling assumptions, but it is known to be both inefficient (relative to a fully parametric approach) and sensitive to the choice of weighting model. Over the last decade, there has been a considerable body of theoretical work to improve the performance of inverse probability weighting, leading to the development of 'doubly robust' or 'doubly protected' estimators. We present an intuitive review of these developments and contrast these estimators with multiple imputation from both a theoretical and a practical viewpoint. 相似文献

2.

Using multiple imputation to estimate cumulative distribution functions in longitudinal data analysis with data missing at random

Phillip Dinh 《Pharmaceutical statistics》2013,12(5):260-267

In longitudinal clinical studies, after randomization at baseline, subjects are followed for a period of time for development of symptoms. The interested inference could be the mean change from baseline to a particular visit in some lab values, the proportion of responders to some threshold category at a particular visit post baseline, or the time to some important event. However, in some applications, the interest may be in estimating the cumulative distribution function (CDF) at a fixed time point post baseline. When the data are fully observed, the CDF can be estimated by the empirical CDF. When patients discontinue prematurely during the course of the study, the empirical CDF cannot be directly used. In this paper, we use multiple imputation as a way to estimate the CDF in longitudinal studies when data are missing at random. The validity of the method is assessed on the basis of the bias and the Kolmogorov–Smirnov distance. The results suggest that multiple imputation yields less bias and less variability than the often used last observation carried forward method. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

3.

Multiple imputation of missing data with ante-dependence covariance structure

Paul Zhang 《Journal of applied statistics》2005,32(2):141-155

A controlled clinical trial was conducted to investigate the efficacy effect of a chemical compound in the treatment of Premenstrual Dysphoric Disorder (PMDD). The data from the trial showed a non-monotone pattern of missing data and an ante-dependence covariance structure. A new analytical method for imputing the missing data with the ante-dependence covariance is proposed. The PMDD data are analysed by the non-imputation method and two imputation methods: the proposed method and the MCMC method. 相似文献

4.

Coping with missing data in phase III pivotal registration trials: Tolvaptan in subjects with kidney disease,a case study

下载免费PDF全文

John Ouyang Kevin J. Carroll Gary Koch Junfang Li 《Pharmaceutical statistics》2017,16(4):250-266

Missing data cause challenging issues, particularly in phase III registration trials, as highlighted by the European Medicines Agency (EMA) and the US National Research Council. We explore, as a case study, how the issues from missing data were tackled in a double‐blind phase III trial in subjects with autosomal dominant polycystic kidney disease. A total of 1445 subjects were randomized in a 2:1 ratio to receive active treatment (tolvaptan), or placebo. The primary outcome, the rate of change in total kidney volume, favored tolvaptan (P < .0001). The key secondary efficacy endpoints of clinical progression of disease and rate of decline in kidney function also favored tolvaptan. However, as highlighted by Food and Drug Administration and EMA, the interpretation of results was hampered by a high number of unevenly distributed dropouts, particularly early dropouts. In this paper, we outline the analyses undertaken to address the issue of missing data thoroughly. “Tipping point analyses” were performed to explore how extreme and detrimental outcomes among subjects with missing data must be to overturn the positive treatment effect attained in those subjects who had complete data. Nonparametric rank‐based analyses were also performed accounting for missing data. In conclusion, straightforward and transparent analyses directly taking into account missing data convincingly support the robustness of the preplanned analyses on the primary and secondary endpoints. Tolvaptan was confirmed to be effective in slowing total kidney volume growth, which is considered an efficacy endpoint by EMA, and in lessening the decline in renal function in patients with autosomal dominant polycystic kidney disease. 相似文献

5.

Data-driven sensitivity analysis to detect missing data mechanism with applications to structural equation modelling

Mortaza Jamshidian Ke-Hai Yuan 《Journal of Statistical Computation and Simulation》2013,83(7):1344-1362

Missing data are a common problem in almost all areas of empirical research. Ignoring the missing data mechanism, especially when data are missing not at random (MNAR), can result in biased and/or inefficient inference. Because MNAR mechanism is not verifiable based on the observed data, sensitivity analysis is often used to assess it. Current sensitivity analysis methods primarily assume a model for the response mechanism in conjunction with a measurement model and examine sensitivity to missing data mechanism via the parameters of the response model. Recently, Jamshidian and Mata (Post-modelling sensitivity analysis to detect the effect of missing data mechanism, Multivariate Behav. Res. 43 (2008), pp. 432–452) introduced a new method of sensitivity analysis that does not require the difficult task of modelling the missing data mechanism. In this method, a single measurement model is fitted to all of the data and to a sub-sample of the data. Discrepancy in the parameter estimates obtained from the the two data sets is used as a measure of sensitivity to missing data mechanism. Jamshidian and Mata describe their method mainly in the context of detecting data that are missing completely at random (MCAR). They used a bootstrap type method, that relies on heuristic input from the researcher, to test for the discrepancy of the parameter estimates. Instead of using bootstrap, the current article obtains confidence interval for parameter differences on two samples based on an asymptotic approximation. Because it does not use bootstrap, the developed procedure avoids likely convergence problems with the bootstrap methods. It does not require heuristic input from the researcher and can be readily implemented in statistical software. The article also discusses methods of obtaining sub-samples that may be used to test missing at random in addition to MCAR. An application of the developed procedure to a real data set, from the first wave of an ongoing longitudinal study on aging, is presented. Simulation studies are performed as well, using two methods of missing data generation, which show promise for the proposed sensitivity method. One method of missing data generation is also new and interesting in its own right. 相似文献

6.

Multiple imputation methods for recurrent event data with missing event category

Douglas E. Schaubel Jianwen Cai 《Revue canadienne de statistique》2006,34(4):677-692

Frequently in clinical and epidemiologic studies, the event of interest is recurrent (i.e., can occur more than once per subject). When the events are not of the same type, an analysis which accounts for the fact that events fall into different categories will often be more informative. Often, however, although event times may always be known, information through which events are categorized may potentially be missing. Complete‐case methods (whose application may require, for example, that events be censored when their category cannot be determined) are valid only when event categories are missing completely at random. This assumption is rather restrictive. The authors propose two multiple imputation methods for analyzing multiple‐category recurrent event data under the proportional means/rates model. The use of a proper or improper imputation technique distinguishes the two approaches. Both methods lead to consistent estimation of regression parameters even when the missingness of event categories depends on covariates. The authors derive the asymptotic properties of the estimators and examine their behaviour in finite samples through simulation. They illustrate their approach using data from an international study on dialysis. 相似文献

7.

A multiple imputation approach to nonlinear mixed-effects models with covariate measurement errors and missing values

Wei Liu Shuyou Li 《Journal of applied statistics》2015,42(3):463-476

In longitudinal studies, nonlinear mixed-effects models have been widely applied to describe the intra- and the inter-subject variations in data. The inter-subject variation usually receives great attention and it may be partially explained by time-dependent covariates. However, some covariates may be measured with substantial errors and may contain missing values. We proposed a multiple imputation method, implemented by a Markov Chain Monte-Carlo method along with Gibbs sampler, to address the covariate measurement errors and missing data in nonlinear mixed-effects models. The multiple imputation method is illustrated in a real data example. Simulation studies show that the multiple imputation method outperforms the commonly used naive methods. 相似文献

8.

Missing data: Discussion points from the PSI missing data expert group

Tomasz Burzykowski James Carpenter Corneel Coens Daniel Evans Lesley France Mike Kenward Peter Lane James Matcham David Morgan Alan Phillips James Roger Brian Sullivan Ian White Ly‐Mee Yu of the PSI Missing Data Expert Group 《Pharmaceutical statistics》2010,9(4):288-297

The Points to Consider Document on Missing Data was adopted by the Committee of Health and Medicinal Products (CHMP) in December 2001. In September 2007 the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop‐outs, explaining the role and limitations of the ‘last observation carried forward’ method and describing the CHMP's cautionary stance on the use of mixed models. In preparation for the release of the updated guidance document, statisticians in the Pharmaceutical Industry held a one‐day expert group meeting in September 2008. Topics that were debated included minimizing the extent of missing data and understanding the missing data mechanism, defining the principles for handling missing data and understanding the assumptions underlying different analysis methods. A clear message from the meeting was that at present, biostatisticians tend only to react to missing data. Limited pro‐active planning is undertaken when designing clinical trials. Missing data mechanisms for a trial need to be considered during the planning phase and the impact on the objectives assessed. Another area for improvement is in the understanding of the pattern of missing data observed during a trial and thus the missing data mechanism via the plotting of data; for example, use of Kaplan–Meier curves looking at time to withdrawal. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献

9.

Application of an imputation method for variance estimation under pseudo-likelihood when missing data are NMAR

Amy M. Kwon 《统计学通讯:理论与方法》2017,46(14):6959-6966

When data are outcome-dependent non response, pseudo-likelihood yields consistent regression coefficients without specifying the missing data mechanism. However, it is onerous to derive parameter estimators including their standard errors from the regression coefficients under pseudo-likelihood (PL). The present study applies an imputation method to compute the asymptotic standard errors of parameter estimators. The proposed method is simpler than Delta method and it showed similar effect size of the standard errors to bootstrapping in simulation and application studies. 相似文献

10.

Sensitivity to imputation models and assumptions in receiver operating characteristic analysis with incomplete data

《Journal of Statistical Computation and Simulation》2012,82(17):3498-3511

Modern statistical methods using incomplete data have been increasingly applied in a wide variety of substantive problems. Similarly, receiver operating characteristic (ROC) analysis, a method used in evaluating diagnostic tests or biomarkers in medical research, has also been increasingly popular problem in both its development and application. While missing-data methods have been applied in ROC analysis, the impact of model mis-specification and/or assumptions (e.g. missing at random) underlying the missing data has not been thoroughly studied. In this work, we study the performance of multiple imputation (MI) inference in ROC analysis. Particularly, we investigate parametric and non-parametric techniques for MI inference under common missingness mechanisms. Depending on the coherency of the imputation model with the underlying data generation mechanism, our results show that MI generally leads to well-calibrated inferences under ignorable missingness mechanisms. 相似文献

11.

基于数据缺失率和缺失模式的多重插补误差研究

彭海艳李意芝孟利军《统计与决策》2022,(1)

文章通过多重插补方法对不同缺失率和缺失模式的多变量缺失样本进行插补,研究了多重插补误差与缺失率和缺失模式的依赖关系。结果表明,当缺失率为0~15%时,多重插补误差与缺失率呈线性关系;当缺失率大于15%时,两者呈偏离线性关系。多重插补误差与缺失模式的方差均值比呈正相关性,当方差均值比越大时,误差也越大。相似文献

12.

Efficient inverse probability weighting method for quantile regression with nonignorable missing data

Pu-Ying Zhao De-Peng Jiang 《Statistics》2017,51(2):363-386

Quantitle regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are subject to missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve nice efficiency properties, we develop an empirical likelihood (EL) approach to QR with the auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrated our proposed methodologies. 相似文献

13.

Bayesian methods for dealing with missing data problems

Zhihua Ma Guanghui Chen 《Journal of the Korean Statistical Society》2018,47(3):297-313

Missing data, a common but challenging issue in most studies, may lead to biased and inefficient inferences if handled inappropriately. As a natural and powerful way for dealing with missing data, Bayesian approach has received much attention in the literature. This paper reviews the recent developments and applications of Bayesian methods for dealing with ignorable and non-ignorable missing data. We firstly introduce missing data mechanisms and Bayesian framework for dealing with missing data, and then introduce missing data models under ignorable and non-ignorable missing data circumstances based on the literature. After that, important issues of Bayesian inference, including prior construction, posterior computation, model comparison and sensitivity analysis, are discussed. Finally, several future issues that deserve further research are summarized and concluded. 相似文献

14.

Missing data sensitivity analysis for recurrent event data using controlled imputation

Oliver N. Keene James H. Roger Benjamin F. Hartley Michael G. Kenward 《Pharmaceutical statistics》2014,13(4):258-264

Statistical analyses of recurrent event data have typically been based on the missing at random assumption. One implication of this is that, if data are collected only when patients are on their randomized treatment, the resulting de jure estimator of treatment effect corresponds to the situation in which the patients adhere to this regime throughout the study. For confirmatory analysis of clinical trials, sensitivity analyses are required to investigate alternative de facto estimands that depart from this assumption. Recent publications have described the use of multiple imputation methods based on pattern mixture models for continuous outcomes, where imputation for the missing data for one treatment arm (e.g. the active arm) is based on the statistical behaviour of outcomes in another arm (e.g. the placebo arm). This has been referred to as controlled imputation or reference‐based imputation. In this paper, we use the negative multinomial distribution to apply this approach to analyses of recurrent events and other similar outcomes. The methods are illustrated by a trial in severe asthma where the primary endpoint was rate of exacerbations and the primary analysis was based on the negative binomial model. Copyright © 2014 John Wiley & Sons, Ltd. 相似文献

15.

Comparison of several imputation methods for missing baseline data in propensity scores analysis of binary outcome

Brenda J. Crowe Ilya A. Lipkovich Ouhong Wang 《Pharmaceutical statistics》2010,9(4):269-279

We performed a simulation study comparing the statistical properties of the estimated log odds ratio from propensity scores analyses of a binary response variable, in which missing baseline data had been imputed using a simple imputation scheme (Treatment Mean Imputation), compared with three ways of performing multiple imputation (MI) and with a Complete Case analysis. MI that included treatment (treated/untreated) and outcome (for our analyses, outcome was adverse event [yes/no]) in the imputer's model had the best statistical properties of the imputation schemes we studied. MI is feasible to use in situations where one has just a few outcomes to analyze. We also found that Treatment Mean Imputation performed quite well and is a reasonable alternative to MI in situations where it is not feasible to use MI. Treatment Mean Imputation performed better than MI methods that did not include both the treatment and outcome in the imputer's model. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献

16.

A simulation comparison of imputation methods for quantitative data in the presence of multiple data patterns

《Journal of Statistical Computation and Simulation》2012,82(18):3588-3619

相似文献

17.

Estimating the population mean in the case of missing data using simple random sampling

Carlos N. Bouza Amer Ibrahim Al-Omari 《Statistics》2013,47(2):279-290

In this paper, we suggest three new ratio estimators of the population mean using quartiles of the auxiliary variable when there are missing data from the sample units. The suggested estimators are investigated under the simple random sampling method. We obtain the mean square errors equations for these estimators. The suggested estimators are compared with the sample mean and ratio estimators in the case of missing data. Also, they are compared with estimators in Singh and Horn [Compromised imputation in survey sampling, Metrika 51 (2000), pp. 267–276], Singh and Deo [Imputation by power transformation, Statist. Papers 45 (2003), pp. 555–579], and Kadilar and Cingi [Estimators for the population mean in the case of missing data, Commun. Stat.-Theory Methods, 37 (2008), pp. 2226–2236] and present under which conditions the proposed estimators are more efficient than other estimators. In terms of accuracy and of the coverage of the bootstrap confidence intervals, the suggested estimators performed better than other estimators. 相似文献

18.

Multiple imputation of censored survival data in the presence of missing covariates using restricted mean survival time

Gurprit Grover 《Journal of applied statistics》2015,42(4):817-827

Missing covariates data with censored outcomes put a challenge in the analysis of clinical data especially in small sample settings. Multiple imputation (MI) techniques are popularly used to impute missing covariates and the data are then analyzed through methods that can handle censoring. However, techniques based on MI are available to impute censored data also but they are not much in practice. In the present study, we applied a method based on multiple imputation by chained equations to impute missing values of covariates and also to impute censored outcomes using restricted survival time in small sample settings. The complete data were then analyzed using linear regression models. Simulation studies and a real example of CHD data show that the present method produced better estimates and lower standard errors when applied on the data having missing covariate values and censored outcomes than the analysis of the data having censored outcome but excluding cases with missing covariates or the analysis when cases with missing covariate values and censored outcomes were excluded from the data (complete case analysis). 相似文献

19.

Latent class based multiple imputation approach for missing categorical data

Mulugeta Gebregziabher Stacia M. DeSantis 《Journal of statistical planning and inference》2010

In this paper we propose a latent class based multiple imputation approach for analyzing missing categorical covariate data in a highly stratified data model. In this approach, we impute the missing data assuming a latent class imputation model and we use likelihood methods to analyze the imputed data. Via extensive simulations, we study its statistical properties and make comparisons with complete case analysis, multiple imputation, saturated log-linear multiple imputation and the Expectation–Maximization approach under seven missing data mechanisms (including missing completely at random, missing at random and not missing at random). These methods are compared with respect to bias, asymptotic standard error, type I error, and 95% coverage probabilities of parameter estimates. Simulations show that, under many missingness scenarios, latent class multiple imputation performs favorably when jointly considering these criteria. A data example from a matched case–control study of the association between multiple myeloma and polymorphisms of the Inter-Leukin 6 genes is considered. 相似文献

20.

Bayesian comparison of diagnostic tests with largely non-informative missing data

Carlos Daniel Paulino 《Journal of Statistical Computation and Simulation》2019,89(10):1877-1886

This work was motivated by a real problem of comparing binary diagnostic tests based upon a gold standard, where the collected data showed that the large majority of classifications were incomplete and the feedback received from the medical doctors allowed us to consider the missingness as non-informative. Taking into account the degree of data incompleteness, we used a Bayesian approach via MCMC methods for drawing inferences of interest on accuracy measures. Its direct implementation by well-known software demonstrated serious problems of chain convergence. The difficulties were overcome by the proposal of a simple, efficient and easily adaptable data augmentation algorithm, performed through an ad hoc computer program. 相似文献