期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Multiple imputation methods for recurrent event data with missing event category

Douglas E. Schaubel Jianwen Cai 《Revue canadienne de statistique》2006,34(4):677-692

Frequently in clinical and epidemiologic studies, the event of interest is recurrent (i.e., can occur more than once per subject). When the events are not of the same type, an analysis which accounts for the fact that events fall into different categories will often be more informative. Often, however, although event times may always be known, information through which events are categorized may potentially be missing. Complete‐case methods (whose application may require, for example, that events be censored when their category cannot be determined) are valid only when event categories are missing completely at random. This assumption is rather restrictive. The authors propose two multiple imputation methods for analyzing multiple‐category recurrent event data under the proportional means/rates model. The use of a proper or improper imputation technique distinguishes the two approaches. Both methods lead to consistent estimation of regression parameters even when the missingness of event categories depends on covariates. The authors derive the asymptotic properties of the estimators and examine their behaviour in finite samples through simulation. They illustrate their approach using data from an international study on dialysis. 相似文献

2.

Comparison of alternative imputation methods for ordinal data

Federica Cugnata 《统计学通讯:模拟与计算》2017,46(1):315-330

In this article, we compare alternative missing imputation methods in the presence of ordinal data, in the framework of CUB (Combination of Uniform and (shifted) Binomial random variable) models. Various imputation methods are considered, as are univariate and multivariate approaches. The first step consists of running a simulation study designed by varying the parameters of the CUB model, to consider and compare CUB models as well as other methods of missing imputation. We use real datasets on which to base the comparison between our approach and some general methods of missing imputation for various missing data mechanisms. 相似文献

3.

A comparison of various software tools for dealing with missing data via imputation

《Journal of Statistical Computation and Simulation》2012,82(11):1653-1675

In real-life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of the missing values prior to the analysis, thereby rendering the data complete. Imputation broadly encompasses an entire scope of techniques that have been developed to make inferences about incomplete data, ranging from very simple strategies (e.g. mean imputation) to more advanced approaches that require estimation, for instance, of posterior distributions using Markov chain Monte Carlo methods. Additional complexity arises when the number of missingness patterns increases and/or when both categorical and continuous random variables are involved. Implementation of routines, procedures, or packages capable of generating imputations for incomplete data are now widely available. We review some of these in the context of a motivating example, as well as in a simulation study, under two missingness mechanisms (missing at random and missing not at random). Thus far, evaluation of existing implementations have frequently centred on the resulting parameter estimates of the prescribed model of interest after imputing the missing data. In some situations, however, interest may very well be on the quality of the imputed values at the level of the individual – an issue that has received relatively little attention. In this paper, we focus on the latter to provide further insight about the performance of the different routines, procedures, and packages in this respect. 相似文献

4.

Mann–Whitney test with empirical likelihood methods for pretest–posttest studies

Min Chen Mary E. Thompson 《Journal of nonparametric statistics》2016,28(2):360-374

Pretest–posttest studies are an important and popular method for assessing the effectiveness of a treatment or an intervention in many scientific fields. While the treatment effect, measured as the difference between the two mean responses, is of primary interest, testing the difference of the two distribution functions for the treatment and the control groups is also an important problem. The Mann–Whitney test has been a standard tool for testing the difference of distribution functions with two independent samples. We develop empirical likelihood-based (EL) methods for the Mann–Whitney test to incorporate the two unique features of pretest–posttest studies: (i) the availability of baseline information for both groups; and (ii) the structure of the data with missing by design. Our proposed methods combine the standard Mann–Whitney test with the EL method of Huang, Qin and Follmann [(2008), ‘Empirical Likelihood-Based Estimation of the Treatment Effect in a Pretest–Posttest Study’, Journal of the American Statistical Association, 103(483), 1270–1280], the imputation-based empirical likelihood method of Chen, Wu and Thompson [(2015), ‘An Imputation-Based Empirical Likelihood Approach to Pretest–Posttest Studies’, The Canadian Journal of Statistics accepted for publication], and the jackknife empirical likelihood method of Jing, Yuan and Zhou [(2009), ‘Jackknife Empirical Likelihood’, Journal of the American Statistical Association, 104, 1224–1232]. Theoretical results are presented and finite sample performances of proposed methods are evaluated through simulation studies. 相似文献

5.

A simulation comparison of imputation methods for quantitative data in the presence of multiple data patterns

《Journal of Statistical Computation and Simulation》2012,82(18):3588-3619

相似文献

6.

Some efficient random imputation methods

Graham Kalton Leslie Kish 《统计学通讯:理论与方法》2013,42(16):1919-1939

Imputation methods that assign a selection of respondents’ values for missing i tern nonresponses give rise to an addd,tional source of sampling variation, which we term imputation varLance , We examine the effect of imputation variance on the precision of the mean, and propose four procedures for sampling the rEespondents that reduce this additional variance. Two of the procedures employ improved sample designs through selection of respc,ndents by sampling without replacement and by stratified sampl;lng. The other two increase the sample base by the use of multiple imputations. 相似文献

7.

Using multiple imputation to estimate cumulative distribution functions in longitudinal data analysis with data missing at random

Phillip Dinh 《Pharmaceutical statistics》2013,12(5):260-267

In longitudinal clinical studies, after randomization at baseline, subjects are followed for a period of time for development of symptoms. The interested inference could be the mean change from baseline to a particular visit in some lab values, the proportion of responders to some threshold category at a particular visit post baseline, or the time to some important event. However, in some applications, the interest may be in estimating the cumulative distribution function (CDF) at a fixed time point post baseline. When the data are fully observed, the CDF can be estimated by the empirical CDF. When patients discontinue prematurely during the course of the study, the empirical CDF cannot be directly used. In this paper, we use multiple imputation as a way to estimate the CDF in longitudinal studies when data are missing at random. The validity of the method is assessed on the basis of the bias and the Kolmogorov–Smirnov distance. The results suggest that multiple imputation yields less bias and less variability than the often used last observation carried forward method. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

8.

Fei Gao Guanghan F. Liu Donglin Zeng Lei Xu Bridget Lin Guoqing Diao Gregory Golm Joseph F. Heyse Joseph G. Ibrahim 《Pharmaceutical statistics》2017,16(6):424-432

In clinical trials, missing data commonly arise through nonadherence to the randomized treatment or to study procedure. For trials in which recurrent event endpoints are of interests, conventional analyses using the proportional intensity model or the count model assume that the data are missing at random, which cannot be tested using the observed data alone. Thus, sensitivity analyses are recommended. We implement the control‐based multiple imputation as sensitivity analyses for the recurrent event data. We model the recurrent event using a piecewise exponential proportional intensity model with frailty and sample the parameters from the posterior distribution. We impute the number of events after dropped out and correct the variance estimation using a bootstrap procedure. We apply the method to an application of sitagliptin study. 相似文献

9.

A new method of imputation in survey sampling

Sarjinder Singh 《Statistics》2013,47(5):499-511

In this paper, an alternative estimator of population mean in the presence of non-response has been suggested which comes in the form of Walsh's estimator. The estimator of mean obtained from the proposed technique remains better than the estimators obtained from ratio or mean methods of imputation. The mean-squared error (MSE) of the resultant estimator is less than that of the estimator obtained on the basis of ratio method of imputation for the optimum choice of parameters. An estimator for estimating a parameter involved in the process of new method of imputation has been discussed. A suggestion to form ‘warm deck’ method of imputation has been suggested. The MSE expressions for the proposed estimators have been derived analytically and compared empirically. The work has been extended to the case of multi-auxiliary information to be used for imputation. Numerical illustrations are also provided. 相似文献

10.

A multiple imputation approach to nonlinear mixed-effects models with covariate measurement errors and missing values

Wei Liu Shuyou Li 《Journal of applied statistics》2015,42(3):463-476

In longitudinal studies, nonlinear mixed-effects models have been widely applied to describe the intra- and the inter-subject variations in data. The inter-subject variation usually receives great attention and it may be partially explained by time-dependent covariates. However, some covariates may be measured with substantial errors and may contain missing values. We proposed a multiple imputation method, implemented by a Markov Chain Monte-Carlo method along with Gibbs sampler, to address the covariate measurement errors and missing data in nonlinear mixed-effects models. The multiple imputation method is illustrated in a real data example. Simulation studies show that the multiple imputation method outperforms the commonly used naive methods. 相似文献

11.

Sensitivity to imputation models and assumptions in receiver operating characteristic analysis with incomplete data

《Journal of Statistical Computation and Simulation》2012,82(17):3498-3511

Modern statistical methods using incomplete data have been increasingly applied in a wide variety of substantive problems. Similarly, receiver operating characteristic (ROC) analysis, a method used in evaluating diagnostic tests or biomarkers in medical research, has also been increasingly popular problem in both its development and application. While missing-data methods have been applied in ROC analysis, the impact of model mis-specification and/or assumptions (e.g. missing at random) underlying the missing data has not been thoroughly studied. In this work, we study the performance of multiple imputation (MI) inference in ROC analysis. Particularly, we investigate parametric and non-parametric techniques for MI inference under common missingness mechanisms. Depending on the coherency of the imputation model with the underlying data generation mechanism, our results show that MI generally leads to well-calibrated inferences under ignorable missingness mechanisms. 相似文献

12.

Interval estimation of the joint survival function for successive duration times under left truncation and right censoring

《Journal of Statistical Computation and Simulation》2012,82(11):1265-1277

In incident cohort studies, survival data often include subjects who have had an initiate event at recruitment and may potentially experience two successive events (first and second) during the follow-up period. Since the second duration process becomes observable only if the first event has occurred, left truncation and dependent censoring arise if the two duration times are correlated. To confront the two potential sampling biases, we propose two inverse-probability-weighted (IPW) estimators for the estimation of the joint survival function of two successive duration times. One of them is similar to the estimator proposed by Chang and Tzeng [Nonparametric estimation of sojourn time distributions for truncated serial event data – a weight adjusted approach, Lifetime Data Anal. 12 (2006), pp. 53–67]. The other is the extension of the nonparametric estimator proposed by Wang and Wells [Nonparametric estimation of successive duration times under dependent censoring, Biometrika 85 (1998), pp. 561–572]. The weak convergence of both estimators are established. Furthermore, the delete-one jackknife and simple bootstrap methods are used to estimate standard deviations and construct interval estimators. A simulation study is conducted to compare the two IPW approaches. 相似文献

13.

Comparisons of imputation methods with application to assess factors associated with self efficacy of physical activity in breast cancer survivors

Yunxi Zhang Ye Lin George Baum Karen M. Basen-Engquist Michael D. Swartz 《统计学通讯:模拟与计算》2013,42(8):2523-2537

ABSTRACT

Missing data are commonly encountered in self-reported measurements and questionnaires. It is crucial to treat missing values using appropriate method to avoid bias and reduction of power. Various types of imputation methods exist, but it is not always clear which method is preferred for imputation of data with non-normal variables. In this paper, we compared four imputation methods: mean imputation, quantile imputation, multiple imputation, and quantile regression multiple imputation (QRMI), using both simulated and real data investigating factors affecting self-efficacy in breast cancer survivors. The results displayed an advantage of using multiple imputation, especially QRMI when data are not normal. 相似文献

14.

Inferences on missing information under multiple imputation and two-stage multiple imputation

Ofer Harel 《Statistical Methodology》2007,4(1):75-89

In the presence of missing values, researchers may be interested in the rates of missing information. The rates of missing information are (a) important for assessing how the missing information contributes to inferential uncertainty about, Q, the population quantity of interest, (b) are an important component in the decision of the number of imputations, and (c) can be used to test model uncertainty and model fitting. In this article I will derive the asymptotic distribution of the rates of missing information in two scenarios: the conventional multiple imputation (MI), and the two-stage MI. Numerically I will show that the proposed asymptotic distribution agrees with the simulated one. I will also suggest the number of imputations needed to obtain reliable missing information rate estimates for each method, based on the asymptotic distribution. 相似文献

15.

Missing data sensitivity analysis for recurrent event data using controlled imputation

Oliver N. Keene James H. Roger Benjamin F. Hartley Michael G. Kenward 《Pharmaceutical statistics》2014,13(4):258-264

Statistical analyses of recurrent event data have typically been based on the missing at random assumption. One implication of this is that, if data are collected only when patients are on their randomized treatment, the resulting de jure estimator of treatment effect corresponds to the situation in which the patients adhere to this regime throughout the study. For confirmatory analysis of clinical trials, sensitivity analyses are required to investigate alternative de facto estimands that depart from this assumption. Recent publications have described the use of multiple imputation methods based on pattern mixture models for continuous outcomes, where imputation for the missing data for one treatment arm (e.g. the active arm) is based on the statistical behaviour of outcomes in another arm (e.g. the placebo arm). This has been referred to as controlled imputation or reference‐based imputation. In this paper, we use the negative multinomial distribution to apply this approach to analyses of recurrent events and other similar outcomes. The methods are illustrated by a trial in severe asthma where the primary endpoint was rate of exacerbations and the primary analysis was based on the negative binomial model. Copyright © 2014 John Wiley & Sons, Ltd. 相似文献

16.

Imputation for missing values and corresponding variance estimation

R. R. Sitter J. N. K. Rao 《Revue canadienne de statistique》1997,25(1):61-73

Imputation is commonly used to compensate for missing data in surveys. We consider the general case where the responses on either the variable of interest y or the auxiliary variable x or both may be missing. We use ratio imputation for y when the associated x is observed and different imputations when x is not observed. We obtain design-consistent linearization and jackknife variance estimators under uniform response. We also report the results of a simulation study on the efficiencies of imputed estimators, and relative biases and efficiencies of associated variance estimators. 相似文献

17.

Multiple imputation using multivariate gh transformations

Yulei He Trivellore E. Raghunathan 《Journal of applied statistics》2012,39(10):2177-2198

Multiple imputation has emerged as a popular approach to handling data sets with missing values. For incomplete continuous variables, imputations are usually produced using multivariate normal models. However, this approach might be problematic for variables with a strong non-normal shape, as it would generate imputations incoherent with actual distributions and thus lead to incorrect inferences. For non-normal data, we consider a multivariate extension of Tukey's gh distribution/transformation [38] to accommodate skewness and/or kurtosis and capture the correlation among the variables. We propose an algorithm to fit the incomplete data with the model and generate imputations. We apply the method to a national data set for hospital performance on several standard quality measures, which are highly skewed to the left and substantially correlated with each other. We use Monte Carlo studies to assess the performance of the proposed approach. We discuss possible generalizations and give some advices to practitioners on how to handle non-normal incomplete data. 相似文献

18.

Bootstrap methods for imputed data from regression,ratio and hot‐deck imputation

Zeinab Mashreghi Christian Léger David Haziza 《Revue canadienne de statistique》2014,42(1):142-167

相似文献

19.

Empirical likelihood for linear regression models under imputation for missing responses

Qihua Wang J. N. K. Rao 《Revue canadienne de statistique》2001,29(4):597-608

The authors study the empirical likelihood method for linear regression models. They show that when missing responses are imputed using least squares predictors, the empirical log‐likelihood ratio is asymptotically a weighted sum of chi‐square variables with unknown weights. They obtain an adjusted empirical log‐likelihood ratio which is asymptotically standard chi‐square and hence can be used to construct confidence regions. They also obtain a bootstrap empirical log‐likelihood ratio and use its distribution to approximate that of the empirical log‐likelihood ratio. A simulation study indicates that the proposed methods are comparable in terms of coverage probabilities and average lengths of confidence intervals, and perform better than a normal approximation based method. 相似文献

20.

Comparison of several imputation methods for missing baseline data in propensity scores analysis of binary outcome

Brenda J. Crowe Ilya A. Lipkovich Ouhong Wang 《Pharmaceutical statistics》2010,9(4):269-279

We performed a simulation study comparing the statistical properties of the estimated log odds ratio from propensity scores analyses of a binary response variable, in which missing baseline data had been imputed using a simple imputation scheme (Treatment Mean Imputation), compared with three ways of performing multiple imputation (MI) and with a Complete Case analysis. MI that included treatment (treated/untreated) and outcome (for our analyses, outcome was adverse event [yes/no]) in the imputer's model had the best statistical properties of the imputation schemes we studied. MI is feasible to use in situations where one has just a few outcomes to analyze. We also found that Treatment Mean Imputation performed quite well and is a reasonable alternative to MI in situations where it is not feasible to use MI. Treatment Mean Imputation performed better than MI methods that did not include both the treatment and outcome in the imputer's model. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献