期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Inferences on missing information under multiple imputation and two-stage multiple imputation

Ofer Harel 《Statistical Methodology》2007,4(1):75-89

In the presence of missing values, researchers may be interested in the rates of missing information. The rates of missing information are (a) important for assessing how the missing information contributes to inferential uncertainty about, Q, the population quantity of interest, (b) are an important component in the decision of the number of imputations, and (c) can be used to test model uncertainty and model fitting. In this article I will derive the asymptotic distribution of the rates of missing information in two scenarios: the conventional multiple imputation (MI), and the two-stage MI. Numerically I will show that the proposed asymptotic distribution agrees with the simulated one. I will also suggest the number of imputations needed to obtain reliable missing information rate estimates for each method, based on the asymptotic distribution. 相似文献

2.

A comparison of multiple imputation and doubly robust estimation for analyses with missing data 总被引：1，自引：0，他引：1

James R. Carpenter Michael G. Kenward Stijn Vansteelandt 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2006,169(3):571-584

Summary. Multiple imputation is now a well-established technique for analysing data sets where some units have incomplete observations. Provided that the imputation model is correct, the resulting estimates are consistent. An alternative, weighting by the inverse probability of observing complete data on a unit, is conceptually simple and involves fewer modelling assumptions, but it is known to be both inefficient (relative to a fully parametric approach) and sensitive to the choice of weighting model. Over the last decade, there has been a considerable body of theoretical work to improve the performance of inverse probability weighting, leading to the development of 'doubly robust' or 'doubly protected' estimators. We present an intuitive review of these developments and contrast these estimators with multiple imputation from both a theoretical and a practical viewpoint. 相似文献

3.

Multiple imputation of missing data with ante-dependence covariance structure

Paul Zhang 《Journal of applied statistics》2005,32(2):141-155

A controlled clinical trial was conducted to investigate the efficacy effect of a chemical compound in the treatment of Premenstrual Dysphoric Disorder (PMDD). The data from the trial showed a non-monotone pattern of missing data and an ante-dependence covariance structure. A new analytical method for imputing the missing data with the ante-dependence covariance is proposed. The PMDD data are analysed by the non-imputation method and two imputation methods: the proposed method and the MCMC method. 相似文献

4.

Latent class models for longitudinal studies of the elderly with data missing at random

Beth A Reboussin Michael E Miller Kurt K Lohman & Thomas R Ten Have 《Journal of the Royal Statistical Society. Series C, Applied statistics》2002,51(1):69-90

The elderly population in the USA is expected to double in size by the year 2025, making longitudinal health studies of this population of increasing importance. The degree of loss to follow-up in studies of the elderly, which is often because elderly people cannot remain in the study, enter a nursing home or die, make longitudinal studies of this population problematic. We propose a latent class model for analysing multiple longitudinal binary health outcomes with multiple-cause non-response when the data are missing at random and a non-likelihood-based analysis is performed. We extend the estimating equations approach of Robins and co-workers to latent class models by reweighting the multiple binary longitudinal outcomes by the inverse probability of being observed. This results in consistent parameter estimates when the probability of non-response depends on observed outcomes and covariates (missing at random) assuming that the model for non-response is correctly specified. We extend the non-response model so that institutionalization, death and missingness due to failure to locate, refusal or incomplete data each have their own set of non-response probabilities. Robust variance estimates are derived which account for the use of a possibly misspecified covariance matrix, estimation of missing data weights and estimation of latent class measurement parameters. This approach is then applied to a study of lower body function among a subsample of the elderly participating in the 6-year Longitudinal Study of Aging. 相似文献

5.

Comparison of alternative imputation methods for ordinal data

Federica Cugnata 《统计学通讯:模拟与计算》2017,46(1):315-330

In this article, we compare alternative missing imputation methods in the presence of ordinal data, in the framework of CUB (Combination of Uniform and (shifted) Binomial random variable) models. Various imputation methods are considered, as are univariate and multivariate approaches. The first step consists of running a simulation study designed by varying the parameters of the CUB model, to consider and compare CUB models as well as other methods of missing imputation. We use real datasets on which to base the comparison between our approach and some general methods of missing imputation for various missing data mechanisms. 相似文献

6.

A cautionary case study of approaches to the treatment of missing data

Christopher Paul William M. Mason Daniel McCaffrey Sarah A. Fox 《Statistical Methods and Applications》2008,17(3):351-372

This article presents findings from a case study of different approaches to the treatment of missing data. Simulations based on data from the Los Angeles Mammography Promotion in Churches Program (LAMP) led the authors to the following cautionary conclusions about the treatment of missing data: (1) Automated selection of the imputation model in the use of full Bayesian multiple imputation can lead to unexpected bias in coefficients of substantive models. (2) Under conditions that occur in actual data, casewise deletion can perform less well than we were led to expect by the existing literature. (3) Relatively unsophisticated imputations, such as mean imputation and conditional mean imputation, performed better than the technical literature led us to expect. (4) To underscore points (1), (2), and (3), the article concludes that imputation models are substantive models, and require the same caution with respect to specificity and calculability. The research reported here was partially supported by National Institutes of Health, National Cancer Institute, R01 CA65879 (SAF). We thank Nicholas Wolfinger, Naihua Duan, John Adams, John Fox, and the anonymous referees for their thoughtful comments on earlier drafts. The responsibility for any remaining errors is ours alone. Benjamin Stein was exceptionally helpful in orchestrating the simulations at the labs of UCLA Social Science Computing. Michael Mitchell of the UCLA Academic Technology Services Statistical Consulting Group artfully created Fig. 1 using the Stata graphics language; we are most grateful. 相似文献

7.

Navigating choices when applying multiple imputation in the presence of multi-level categorical interaction effects

《Statistical Methodology》2015

Multiple imputation (MI) is an appealing option for handling missing data. When implementing MI, however, users need to make important decisions to obtain estimates with good statistical properties. One such decision involves the choice of imputation model–the joint modeling (JM) versus fully conditional specification (FCS) approach. Another involves the choice of method to handle interactions. These include imputing the interaction term as any other variable (active imputation), or imputing the main effects and then deriving the interaction (passive imputation). Our study investigates the best approach to perform MI in the presence of interaction effects involving two categorical variables. Such effects warrant special attention as they involve multiple correlated parameters that are handled differently under JM and FCS modeling. Through an extensive simulation study, we compared active, passive and an improved passive approach under FCS, as JM precludes passive imputation. We additionally compared JM and FCS techniques using active imputation. Performance between active and passive imputation was comparable. The improved passive approach proved superior to the other two particularly when the number of parameters corresponding to the interaction was large. JM without rounding and FCS using active imputation were also mostly comparable, with JM outperforming FCS when the number of parameters was large. In a direct comparison of JM active and FCS improved passive, the latter was the clear winner. We recommend improved passive imputation under FCS along with sensitivity analyses to handle multi-level interaction terms. 相似文献

8.

Comparisons of imputation methods with application to assess factors associated with self efficacy of physical activity in breast cancer survivors

Yunxi Zhang Ye Lin George Baum Karen M. Basen-Engquist Michael D. Swartz 《统计学通讯:模拟与计算》2013,42(8):2523-2537

ABSTRACT

Missing data are commonly encountered in self-reported measurements and questionnaires. It is crucial to treat missing values using appropriate method to avoid bias and reduction of power. Various types of imputation methods exist, but it is not always clear which method is preferred for imputation of data with non-normal variables. In this paper, we compared four imputation methods: mean imputation, quantile imputation, multiple imputation, and quantile regression multiple imputation (QRMI), using both simulated and real data investigating factors affecting self-efficacy in breast cancer survivors. The results displayed an advantage of using multiple imputation, especially QRMI when data are not normal. 相似文献

9.

Using multiple imputation to estimate cumulative distribution functions in longitudinal data analysis with data missing at random

Phillip Dinh 《Pharmaceutical statistics》2013,12(5):260-267

In longitudinal clinical studies, after randomization at baseline, subjects are followed for a period of time for development of symptoms. The interested inference could be the mean change from baseline to a particular visit in some lab values, the proportion of responders to some threshold category at a particular visit post baseline, or the time to some important event. However, in some applications, the interest may be in estimating the cumulative distribution function (CDF) at a fixed time point post baseline. When the data are fully observed, the CDF can be estimated by the empirical CDF. When patients discontinue prematurely during the course of the study, the empirical CDF cannot be directly used. In this paper, we use multiple imputation as a way to estimate the CDF in longitudinal studies when data are missing at random. The validity of the method is assessed on the basis of the bias and the Kolmogorov–Smirnov distance. The results suggest that multiple imputation yields less bias and less variability than the often used last observation carried forward method. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

10.

Inference for domains under imputation for missing survey data

David Haziza J. N. K. Rao 《Revue canadienne de statistique》2005,33(2):149-161

The authors study the estimation of domain totals and means under survey‐weighted regression imputation for missing items. They use two different approaches to inference: (i) design‐based with uniform response within classes; (ii) model‐assisted with ignorable response and an imputation model. They show that the imputed domain estimators are biased under (i) but approximately unbiased under (ii). They obtain a bias‐adjusted estimator that is approximately unbiased under (i) or (ii). They also derive linearization variance estimators. They report the results of a simulation study on the bias ratio and efficiency of alternative estimators, including a complete case estimator that requires the knowledge of response indicators. 相似文献

11.

Estimating household structure in ancient China by using historical data: a latent class analysis of partially missing patterns

Tim Futing Liao 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2004,167(1):125-139

Summary. Social data often contain missing information. The problem is inevitably severe when analysing historical data. Conventionally, researchers analyse complete records only. Listwise deletion not only reduces the effective sample size but also may result in biased estimation, depending on the missingness mechanism. We analyse household types by using population registers from ancient China (618–907 AD) by comparing a simple classification, a latent class model of the complete data and a latent class model of the complete and partially missing data assuming four types of ignorable and non-ignorable missingness mechanisms. The findings show that either a frequency classification or a latent class analysis using the complete records only yielded biased estimates and incorrect conclusions in the presence of partially missing data of a non-ignorable mechanism. Although simply assuming ignorable or non-ignorable missing data produced consistently similarly higher estimates of the proportion of complex households, a specification of the relationship between the latent variable and the degree of missingness by a row effect uniform association model helped to capture the missingness mechanism better and improved the model fit. 相似文献

12.

A multiple imputation approach to nonlinear mixed-effects models with covariate measurement errors and missing values

Wei Liu Shuyou Li 《Journal of applied statistics》2015,42(3):463-476

In longitudinal studies, nonlinear mixed-effects models have been widely applied to describe the intra- and the inter-subject variations in data. The inter-subject variation usually receives great attention and it may be partially explained by time-dependent covariates. However, some covariates may be measured with substantial errors and may contain missing values. We proposed a multiple imputation method, implemented by a Markov Chain Monte-Carlo method along with Gibbs sampler, to address the covariate measurement errors and missing data in nonlinear mixed-effects models. The multiple imputation method is illustrated in a real data example. Simulation studies show that the multiple imputation method outperforms the commonly used naive methods. 相似文献

13.

Control‐based imputation for sensitivity analyses in informative censoring for recurrent event data

下载免费PDF全文

Fei Gao Guanghan F. Liu Donglin Zeng Lei Xu Bridget Lin Guoqing Diao Gregory Golm Joseph F. Heyse Joseph G. Ibrahim 《Pharmaceutical statistics》2017,16(6):424-432

In clinical trials, missing data commonly arise through nonadherence to the randomized treatment or to study procedure. For trials in which recurrent event endpoints are of interests, conventional analyses using the proportional intensity model or the count model assume that the data are missing at random, which cannot be tested using the observed data alone. Thus, sensitivity analyses are recommended. We implement the control‐based multiple imputation as sensitivity analyses for the recurrent event data. We model the recurrent event using a piecewise exponential proportional intensity model with frailty and sample the parameters from the posterior distribution. We impute the number of events after dropped out and correct the variance estimation using a bootstrap procedure. We apply the method to an application of sitagliptin study. 相似文献

14.

Multiple imputation for ordinal longitudinal data with monotone missing data patterns

A.Y. Kombo H. Mwambi G. Molenberghs 《Journal of applied statistics》2017,44(2):270-287

Missing data often complicate the analysis of scientific data. Multiple imputation is a general purpose technique for analysis of datasets with missing values. The approach is applicable to a variety of missing data patterns but often complicated by some restrictions like the type of variables to be imputed and the mechanism underlying the missing data. In this paper, the authors compare the performance of two multiple imputation methods, namely fully conditional specification and multivariate normal imputation in the presence of ordinal outcomes with monotone missing data patterns. Through a simulation study and an empirical example, the authors show that the two methods are indeed comparable meaning any of the two may be used when faced with scenarios, at least, as the ones presented here. 相似文献

15.

Multiple imputation methods for recurrent event data with missing event category

Douglas E. Schaubel Jianwen Cai 《Revue canadienne de statistique》2006,34(4):677-692

Frequently in clinical and epidemiologic studies, the event of interest is recurrent (i.e., can occur more than once per subject). When the events are not of the same type, an analysis which accounts for the fact that events fall into different categories will often be more informative. Often, however, although event times may always be known, information through which events are categorized may potentially be missing. Complete‐case methods (whose application may require, for example, that events be censored when their category cannot be determined) are valid only when event categories are missing completely at random. This assumption is rather restrictive. The authors propose two multiple imputation methods for analyzing multiple‐category recurrent event data under the proportional means/rates model. The use of a proper or improper imputation technique distinguishes the two approaches. Both methods lead to consistent estimation of regression parameters even when the missingness of event categories depends on covariates. The authors derive the asymptotic properties of the estimators and examine their behaviour in finite samples through simulation. They illustrate their approach using data from an international study on dialysis. 相似文献

16.

Matched case–control data analyses with missing covariates

M. C. Paik & R. L. Sacco 《Journal of the Royal Statistical Society. Series C, Applied statistics》2000,49(1):145-156

We consider methods for analysing matched case–control data when some covariates ( W ) are completely observed but other covariates ( X ) are missing for some subjects. In matched case–control studies, the complete-record analysis discards completely observed subjects if none of their matching cases or controls are completely observed. We investigate an imputation estimate obtained by solving a joint estimating equation for log-odds ratios of disease and parameters in an imputation model. Imputation estimates for coefficients of W are shown to have smaller bias and mean-square error than do estimates from the complete-record analysis. 相似文献

17.

Statistical inference under imputation for proportional hazard model with missing covariates

Zhiping Qiu 《统计学通讯:理论与方法》2017,46(23):11575-11590

Missing covariate data are common in biomedical studies. In this article, by using the non parametric kernel regression technique, a new imputation approach is developed for the Cox-proportional hazard regression model with missing covariates. This method achieves the same efficiency as the fully augmented weighted estimators (Qi et al. 2005. Journal of the American Statistical Association, 100:1250) and has a simpler form. The asymptotic properties of the proposed estimator are derived and analyzed. The comparisons between the proposed imputation method and several other existing methods are conducted via a number of simulation studies and a mouse leukemia data. 相似文献

18.

A comparison of various software tools for dealing with missing data via imputation

《Journal of Statistical Computation and Simulation》2012,82(11):1653-1675

In real-life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of the missing values prior to the analysis, thereby rendering the data complete. Imputation broadly encompasses an entire scope of techniques that have been developed to make inferences about incomplete data, ranging from very simple strategies (e.g. mean imputation) to more advanced approaches that require estimation, for instance, of posterior distributions using Markov chain Monte Carlo methods. Additional complexity arises when the number of missingness patterns increases and/or when both categorical and continuous random variables are involved. Implementation of routines, procedures, or packages capable of generating imputations for incomplete data are now widely available. We review some of these in the context of a motivating example, as well as in a simulation study, under two missingness mechanisms (missing at random and missing not at random). Thus far, evaluation of existing implementations have frequently centred on the resulting parameter estimates of the prescribed model of interest after imputing the missing data. In some situations, however, interest may very well be on the quality of the imputed values at the level of the individual – an issue that has received relatively little attention. In this paper, we focus on the latter to provide further insight about the performance of the different routines, procedures, and packages in this respect. 相似文献

19.

A multiple imputation method for incomplete correlated ordinal data using multivariate probit models

Xiao Zhang Quanlin Li Karen Cropsey Xiaowei Yang Kui Zhang Thomas Belin 《统计学通讯:模拟与计算》2017,46(3):2360-2375

The multiple imputation technique has proven to be a useful tool in missing data analysis. We propose a Markov chain Monte Carlo method to conduct multiple imputation for incomplete correlated ordinal data using the multivariate probit model. We conduct a thorough simulation study to compare the performance of our proposed method with two available imputation methods – multivariate normal-based and chain equation methods for various missing data scenarios. For illustration, we present an application using the data from the smoking cessation treatment study for low-income community corrections smokers. 相似文献

20.

Comparison of several imputation methods for missing baseline data in propensity scores analysis of binary outcome

Brenda J. Crowe Ilya A. Lipkovich Ouhong Wang 《Pharmaceutical statistics》2010,9(4):269-279

We performed a simulation study comparing the statistical properties of the estimated log odds ratio from propensity scores analyses of a binary response variable, in which missing baseline data had been imputed using a simple imputation scheme (Treatment Mean Imputation), compared with three ways of performing multiple imputation (MI) and with a Complete Case analysis. MI that included treatment (treated/untreated) and outcome (for our analyses, outcome was adverse event [yes/no]) in the imputer's model had the best statistical properties of the imputation schemes we studied. MI is feasible to use in situations where one has just a few outcomes to analyze. We also found that Treatment Mean Imputation performed quite well and is a reasonable alternative to MI in situations where it is not feasible to use MI. Treatment Mean Imputation performed better than MI methods that did not include both the treatment and outcome in the imputer's model. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献