Similar Documents
20 similar documents found.
1.
Imputation methods that assign a selection of respondents' values for missing item nonresponses give rise to an additional source of sampling variation, which we term imputation variance. We examine the effect of imputation variance on the precision of the mean, and propose four procedures for sampling the respondents that reduce this additional variance. Two of the procedures employ improved sample designs through selection of respondents by sampling without replacement and by stratified sampling. The other two increase the sample base by the use of multiple imputations.
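A minimal sketch of the imputation-variance idea, as a hypothetical Monte Carlo of our own construction (not the authors' code): hot-deck donors are drawn for the nonrespondents either with or without replacement, and the spread of the resulting means across repetitions is the imputation variance the abstract describes.

```python
# Sketch: hot-deck imputation variance under two donor-selection schemes.
# Illustrative only; data, names and parameters are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(50, 10, size=200)          # true values
observed = rng.random(200) > 0.3          # ~30% item nonresponse
donors = y[observed]
n_missing = (~observed).sum()

def imputed_mean(with_replacement: bool) -> float:
    """Fill nonrespondents with donor values and return the sample mean."""
    fill = rng.choice(donors, size=n_missing, replace=with_replacement)
    est = y.copy()
    est[~observed] = fill
    return est.mean()

# Monte Carlo: variability across repetitions is the imputation variance.
wr = [imputed_mean(True) for _ in range(2000)]
wor = [imputed_mean(False) for _ in range(2000)]
print("imputation variance, with replacement:    %.4f" % np.var(wr))
print("imputation variance, without replacement: %.4f" % np.var(wor))
```

Drawing donors without replacement typically yields the smaller variance, which is the direction of the paper's first pair of proposals.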

2.
It is cleared in recent researches that the raising of missing values in datasets is inevitable. Imputation of missing data is one of the several methods which have been introduced to overcome this issue. Imputation techniques are trying to answer the case of missing data by covering missing values with reasonable estimates permanently. There are a lot of benefits for these procedures rather than their drawbacks. The operation of these methods has not been clarified, which means that they provide mistrust among analytical results. One approach to evaluate the outcomes of the imputation process is estimating uncertainty in the imputed data. Nonparametric methods are appropriate to estimating the uncertainty when data are not followed by any particular distribution. This paper deals with a nonparametric method for estimation and testing the significance of the imputation uncertainty, which is based on Wilcoxon test statistic, and which could be employed for estimating the precision of the imputed values created by imputation methods. This proposed procedure could be employed to judge the possibility of the imputation process for datasets, and to evaluate the influence of proper imputation methods when they are utilized to the same dataset. This proposed approach has been compared with other nonparametric resampling methods, including bootstrap and jackknife to estimate uncertainty in the imputed data under the Bayesian bootstrap imputation method. The ideas supporting the proposed method are clarified in detail, and a simulation study, which indicates how the approach has been employed in practical situations, is illustrated.  相似文献   
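The following sketch shows Bayesian bootstrap imputation (Rubin, 1981) paired with a Wilcoxon rank-sum check of the imputed values. It is our illustration under simplifying assumptions, not the paper's exact test statistic.

```python
# Sketch: Bayesian bootstrap imputation with a Wilcoxon-based check.
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(1)
obs = rng.exponential(2.0, size=150)      # observed (non-normal) values
n_mis = 50                                # number of missing entries

def bayesian_bootstrap_impute(observed, n_missing, rng):
    # Draw Dirichlet(1,...,1) weights over the observed values, then
    # impute by resampling the observed values with those weights.
    w = rng.dirichlet(np.ones(len(observed)))
    return rng.choice(observed, size=n_missing, replace=True, p=w)

imputed = bayesian_bootstrap_impute(obs, n_mis, rng)

# Wilcoxon rank-sum test: are the imputed values exchangeable with the
# observed ones?  A small p-value flags substantial imputation uncertainty.
stat, p = ranksums(obs, imputed)
print(f"Wilcoxon rank-sum statistic = {stat:.3f}, p-value = {p:.3f}")
```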

3.
Summary: This paper deals with item nonresponse on income questions in panel surveys and with longitudinal and cross-sectional imputation strategies to cope with this phenomenon. Using data from the German SOEP, we compare income inequality and mobility indicators based only on truly observed information to those derived from observed and imputed observations. First, we find a positive correlation between inequality and imputation. Secondly, income mobility appears to be significantly understated when using observed information only. Finally, longitudinal analyses provide evidence for a positive inter-temporal correlation between item nonresponse and any kind of subsequent nonresponse. We are grateful to two anonymous referees and to Jan Goebel for very helpful comments and suggestions on an earlier draft of this paper. The paper also benefited from discussions with seminar participants at the Workshop on Item Nonresponse and Data Quality in Large Social Surveys, Basel/CH, October 9–11, 2003.

4.
5.
In the presence of missing values, researchers may be interested in the rates of missing information. The rates of missing information are (a) important for assessing how the missing information contributes to inferential uncertainty about Q, the population quantity of interest, (b) an important component in deciding the number of imputations, and (c) useful for testing model uncertainty and model fit. In this article I derive the asymptotic distribution of the rates of missing information in two scenarios: conventional multiple imputation (MI) and two-stage MI. Numerically, I show that the proposed asymptotic distribution agrees with the simulated one. Based on the asymptotic distribution, I also suggest the number of imputations needed to obtain reliable estimates of the missing-information rate for each method.
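For context, the conventional estimate of the rate of missing information comes from Rubin's combining rules. The sketch below computes it from m completed-data estimates; the numbers are made up for illustration.

```python
# Sketch: rate of missing information from m multiply imputed estimates
# and their within-imputation variances (Rubin's combining rules).
import numpy as np

qhat = np.array([4.9, 5.3, 5.1, 5.6, 4.8])       # estimate of Q per imputation
uhat = np.array([0.40, 0.42, 0.38, 0.41, 0.39])  # within-imputation variances
m = len(qhat)

ubar = uhat.mean()                    # average within-imputation variance
b = qhat.var(ddof=1)                  # between-imputation variance
r = (1 + 1/m) * b / ubar              # relative increase in variance due to NR
nu = (m - 1) * (1 + 1/r) ** 2         # Rubin's degrees of freedom
gamma = (r + 2 / (nu + 3)) / (r + 1)  # estimated rate of missing information
print(f"relative increase r = {r:.3f}, missing-information rate = {gamma:.3f}")
```

The paper's contribution is the asymptotic distribution of such estimates, which in turn calibrates how large m must be for them to be reliable.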

6.
Multiple imputation has emerged as a popular approach to handling data sets with missing values. For incomplete continuous variables, imputations are usually produced using multivariate normal models. However, this approach can be problematic for variables with a strongly non-normal shape, as it generates imputations inconsistent with the actual distributions and thus leads to incorrect inferences. For non-normal data, we consider a multivariate extension of Tukey's gh distribution/transformation [38] to accommodate skewness and/or kurtosis and to capture the correlation among the variables. We propose an algorithm to fit the incomplete data with the model and generate imputations. We apply the method to a national data set on hospital performance for several standard quality measures, which are highly skewed to the left and substantially correlated with each other. We use Monte Carlo studies to assess the performance of the proposed approach. We discuss possible generalizations and offer some advice to practitioners on how to handle non-normal incomplete data.
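The univariate building block is Tukey's g-and-h transform of a standard normal draw, where g controls skewness and h controls tail heaviness. A minimal sketch (our own, assuming g != 0; the paper extends this to the multivariate, incomplete-data setting):

```python
# Sketch: Tukey g-and-h transformation of standard-normal draws.
import numpy as np
from scipy.stats import skew, kurtosis

def tukey_gh(z, g=-0.7, h=0.05, loc=0.0, scale=1.0):
    """Tukey g-and-h transform of standard-normal z (assumes g != 0)."""
    return loc + scale * ((np.exp(g * z) - 1.0) / g) * np.exp(h * z**2 / 2.0)

rng = np.random.default_rng(2)
z = rng.standard_normal(100_000)
x = tukey_gh(z)   # g < 0 gives a left-skewed variable, like the quality data
print(f"sample skewness = {skew(x):.2f}, excess kurtosis = {kurtosis(x):.2f}")
```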

7.
We investigate by simulation how the wild bootstrap and the pairs bootstrap perform in t and F tests of regression parameters in the stochastic regression model, where the explanatory variables are stochastic rather than fixed and there is no heteroskedasticity. The wild bootstrap procedure due to Davidson and Flachaire [The wild bootstrap, tamed at last, Working paper, IER#1000, Queen's University, 2001] with restricted residuals works best, but its dominance is not as strong as in the results of Flachaire [Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap, Comput. Statist. Data Anal. 49 (2005), pp. 361–376] for the fixed regression model, where the explanatory variables are fixed and heteroskedasticity is present.
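A wild-bootstrap t-test with Rademacher weights on restricted residuals, in the spirit of Davidson and Flachaire, looks roughly like the sketch below. This is a simplified single-regressor illustration, not the paper's simulation design.

```python
# Sketch: wild bootstrap t-test of H0: slope = 0, restricted residuals.
import numpy as np

rng = np.random.default_rng(3)
n, B = 100, 999
x = rng.normal(size=n)                    # stochastic regressor
y = 1.0 + 0.0 * x + rng.normal(size=n)    # H0 true: slope is zero

def tstat(y, x):
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

t_obs = tstat(y, x)
u_restricted = y - y.mean()               # residuals under H0 (intercept only)
t_boot = np.empty(B)
for b in range(B):
    v = rng.choice([-1.0, 1.0], size=n)   # Rademacher weights
    y_star = y.mean() + u_restricted * v  # wild bootstrap DGP imposing H0
    t_boot[b] = tstat(y_star, x)
p = np.mean(np.abs(t_boot) >= abs(t_obs))
print(f"wild bootstrap p-value = {p:.3f}")
```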

8.
In clinical trials, missing data commonly arise through nonadherence to the randomized treatment or to study procedures. For trials in which recurrent event endpoints are of interest, conventional analyses using the proportional intensity model or a count model assume that the data are missing at random, an assumption that cannot be tested using the observed data alone. Sensitivity analyses are therefore recommended. We implement control-based multiple imputation as a sensitivity analysis for recurrent event data. We model the recurrent events using a piecewise exponential proportional intensity model with frailty and sample the parameters from the posterior distribution. We impute the number of events after dropout and correct the variance estimation using a bootstrap procedure. We apply the method to data from a sitagliptin study.
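The core of a control-based imputation can be sketched as follows: after dropout, the unobserved follow-up is imputed from the control arm's event rate. This is a heavily simplified stand-in (constant rates, hypothetical numbers) for the paper's piecewise exponential frailty model.

```python
# Sketch: control-based imputation of recurrent-event counts after dropout.
import numpy as np

rng = np.random.default_rng(4)
rate_control = 0.8                        # events per unit time, control arm
followup_remaining = np.array([0.5, 1.2, 0.0, 2.0])  # per dropout subject
observed_counts = np.array([1, 0, 3, 2])  # events seen before dropout

def impute_once(rng):
    # Post-dropout counts drawn from a Poisson at the control-arm rate.
    extra = rng.poisson(rate_control * followup_remaining)
    return observed_counts + extra

imputations = np.array([impute_once(rng) for _ in range(20)])  # 20 MIs
print("mean imputed event totals per subject:", imputations.mean(axis=0))
```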

9.
The bootstrap is typically less reliable in the context of time-series models with serial correlation of unknown form than when the regularity conditions for the conventional IID bootstrap apply. It is therefore useful to have diagnostic techniques capable of evaluating bootstrap performance in specific cases. Those suggested in this paper are closely related to the fast double bootstrap (FDB) and are not computationally intensive. They can also be used to gauge the performance of the FDB itself. Examples of bootstrapping time series are presented, which illustrate the diagnostic procedures and show how the results can cast light on bootstrap performance.
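For reference, the FDB of Davidson and MacKinnon spawns exactly one second-level resample per first-level resample. A toy IID sketch of the FDB p-value (our construction, for a one-sided mean test rather than the paper's time-series setting):

```python
# Sketch: fast double bootstrap (FDB) p-value for H0: mean = 0.
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(0.2, 1.0, size=50)
B = 999

def stat(sample):
    return np.sqrt(len(sample)) * sample.mean() / sample.std(ddof=1)

tau_hat = stat(x)
x0 = x - x.mean()                          # impose H0 on the resampling DGP
tau1 = np.empty(B)                         # first-level statistics
tau2 = np.empty(B)                         # one second-level stat per draw
for b in range(B):
    xb = rng.choice(x0, size=len(x0), replace=True)
    tau1[b] = stat(xb)
    xbb = rng.choice(xb - xb.mean(), size=len(x0), replace=True)
    tau2[b] = stat(xbb)

p1 = np.mean(tau1 > tau_hat)               # ordinary bootstrap p-value
q2 = np.quantile(tau2, 1 - p1)             # second-level critical value
p_fdb = np.mean(tau1 > q2)                 # FDB-adjusted p-value
print(f"bootstrap p = {p1:.3f}, FDB p = {p_fdb:.3f}")
```

Comparing p1 with p_fdb is one simple diagnostic of how much the first-level bootstrap distorts the test's level.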

10.
Autoregressive models are widely employed for prediction and other inference in many scientific fields. While determining their order is in general a difficult and critical step, the task becomes more complicated and more crucial when the time series under investigation is a realization of a stochastic process characterized by sparsity. In this paper we present a method for order determination of a stationary AR model with a sparse structure, given a set of observations, based upon a bootstrapped version of the MAICE procedure [Akaike H. Prediction and entropy. Springer; 1998], in conjunction with a LASSO-type constraining procedure that suppresses insignificant lags. Empirical results are obtained via Monte Carlo simulations. The quality of the method is assessed by comparison with the commonly adopted cross-validation approach and with the non-bootstrap counterpart of the presented procedure.
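A bare-bones version of the two ingredients, MAICE-style order selection followed by a LASSO refit to zero out insignificant lags, might look like this. The bootstrap layer of the paper's procedure is omitted, and all tuning values are hypothetical.

```python
# Sketch: AIC-minimizing AR order selection, then a LASSO lag refit.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n = 500
y = np.zeros(n)
for t in range(3, n):                 # sparse AR(3): only lags 1 and 3 active
    y[t] = 0.5 * y[t-1] + 0.3 * y[t-3] + rng.normal()

def lagmat(y, p):
    """Design matrix of lags 1..p and the aligned response."""
    return np.column_stack([y[p - k:-k] for k in range(1, p + 1)]), y[p:]

def aic(y, p):
    X, yy = lagmat(y, p)
    beta, *_ = np.linalg.lstsq(X, yy, rcond=None)
    rss = np.sum((yy - X @ beta) ** 2)
    return len(yy) * np.log(rss / len(yy)) + 2 * p

p_hat = min(range(1, 11), key=lambda p: aic(y, p))    # MAICE step
X, yy = lagmat(y, p_hat)
coefs = Lasso(alpha=0.05, fit_intercept=False).fit(X, yy).coef_
print(f"selected order = {p_hat}, nonzero lags = {np.nonzero(coefs)[0] + 1}")
```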

11.
In this study, we propose sufficient time-series bootstrap methods that achieve better results than the conventional non-overlapping block bootstrap, with less computing time and lower standard errors of estimation. We also propose a new technique using ordered bootstrapped blocks to better preserve the dependence structure of the original data. The performance of the proposed methods is compared in a simulation study for MA(2) and AR(2) processes and in an example. The results show that our methods are good competitors that often improve on the conventional block methods.
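The sketch below contrasts a plain non-overlapping block bootstrap with an "ordered blocks" variant in which resampled blocks are re-sorted by original position. Note the ordered variant is our reading of the idea, not the paper's exact algorithm.

```python
# Sketch: non-overlapping block bootstrap, plain vs. ordered blocks.
import numpy as np

rng = np.random.default_rng(7)
n, block_len = 120, 10
x = np.empty(n)
x[0] = rng.normal()
for t in range(1, n):
    x[t] = 0.6 * x[t - 1] + rng.normal()   # AR(1)-type dependence
blocks = x.reshape(-1, block_len)          # non-overlapping blocks

def nbb(rng):
    idx = rng.integers(0, len(blocks), size=len(blocks))
    return blocks[idx].ravel()

def ordered_nbb(rng):
    # Same draw, but blocks are laid out in their original time order.
    idx = np.sort(rng.integers(0, len(blocks), size=len(blocks)))
    return blocks[idx].ravel()

est_plain = [nbb(rng).mean() for _ in range(2000)]
est_order = [ordered_nbb(rng).mean() for _ in range(2000)]
print("SE of mean, plain NBB:   %.4f" % np.std(est_plain))
print("SE of mean, ordered NBB: %.4f" % np.std(est_order))
```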

12.
Multiple imputation (MI) is an increasingly popular method for analysing incomplete multivariate data sets. One of the most crucial assumptions of this method concerns the mechanism leading to missing data. Distinctness is typically assumed, meaning that the mechanisms underlying missingness and data generation are completely independent. In addition, missing at random or missing completely at random is assumed, which states explicitly under which conditions missingness is independent of the observed data. Despite common use of MI under these assumptions, their plausibility and the sensitivity of MI to them have not been well investigated. In this work, we investigate the impact of non-distinctness and non-ignorability, where non-ignorability is due to unobservable cluster-specific effects (e.g. random effects). Through a comprehensive simulation study, we show that non-ignorability due to non-distinctness does not immediately imply dismal MI performance, while non-ignorability due to data missing not at random leads to quite subpar performance.

13.
In this article, we consider the two-factor unbalanced nested design model without the assumption of equal error variances. For the problem of testing the 'main effects' of both factors, we propose a parametric bootstrap (PB) approach and compare it with the existing generalized F (GF) test. The Type I error rates of the tests are evaluated using Monte Carlo simulation. Our studies show that the PB test performs better than the GF test: the PB test performs satisfactorily even for small samples, while the GF test exhibits poor Type I error properties as the number of factorial combinations or treatments grows. We also note that the same tests can be used to test the significance of the random-effect variance component in a two-factor mixed-effects nested model under unequal error variances.
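The PB mechanics, estimate nuisance parameters, simulate the null distribution of the statistic under those estimates, and compare, can be shown in a deliberately simplified setting: one-way heteroscedastic ANOVA rather than the paper's two-factor nested design.

```python
# Sketch: parametric bootstrap (PB) test, one-way heteroscedastic toy case.
import numpy as np

rng = np.random.default_rng(8)
groups = [rng.normal(0.0, s, size=n) for s, n in [(1, 8), (2, 12), (3, 6)]]

def test_stat(samples):
    # Welch-type statistic: variance-weighted squared mean deviations.
    w = np.array([len(g) / g.var(ddof=1) for g in samples])
    m = np.array([g.mean() for g in samples])
    grand = np.sum(w * m) / w.sum()
    return np.sum(w * (m - grand) ** 2)

t_obs = test_stat(groups)
B, exceed = 4999, 0
for _ in range(B):
    # Simulate under H0 (equal means) with the *estimated* variances.
    sim = [rng.normal(0.0, g.std(ddof=1), size=len(g)) for g in groups]
    exceed += test_stat(sim) >= t_obs
print(f"PB p-value = {exceed / B:.3f}")
```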

14.
15.
Zhuqing Yu, Statistics, 2017, 51(2): 277–293
It has been found, in a smooth function model setting, that the n out of n bootstrap is inconsistent at stationary points of the smooth function, whereas the m out of n bootstrap is consistent, provided that the correct convergence rate of the plug-in smooth function estimator is specified. By considering a more general moving-parameter framework, we show that neither of the above bootstrap methods is consistent uniformly over neighbourhoods of stationary points, so that anomalies often arise in the coverage of bootstrap sets over certain subsets of parameter values. We propose a recentred bootstrap procedure for constructing confidence sets with uniformly correct coverage over compact sets containing stationary points. A weighted bootstrap procedure is also proposed as an alternative for more general circumstances. Unlike the m out of n bootstrap, neither procedure requires knowledge of the convergence rate of the smooth function estimator. The empirical performance of our procedures is illustrated with numerical examples.
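The canonical example is g(mu) = mu^2 at the stationary point mu = 0, where the full-sample bootstrap fails. A sketch contrasting the n out of n and m out of n bootstrap distributions (illustrative only; the paper's recentred and weighted procedures are not reproduced here):

```python
# Sketch: n out of n vs. m out of n bootstrap for g(mu) = mu^2 at mu = 0.
import numpy as np

rng = np.random.default_rng(9)
n = 400
x = rng.normal(0.0, 1.0, size=n)   # true mu = 0: a stationary point of g
theta_hat = x.mean() ** 2

def boot_dist(m, B=2000):
    reps = np.empty(B)
    for b in range(B):
        xb = rng.choice(x, size=m, replace=True)
        reps[b] = xb.mean() ** 2
    return reps

full = boot_dist(n)                # n out of n: inconsistent here
small = boot_dist(int(n ** 0.5))   # m = sqrt(n), so m/n -> 0
print("95% bootstrap interval, n out of n:", np.quantile(full, [0.025, 0.975]))
print("95% bootstrap interval, m out of n:", np.quantile(small, [0.025, 0.975]))
```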

16.
Multiple imputation (MI) is an appealing option for handling missing data. When implementing MI, however, users must make important decisions to obtain estimates with good statistical properties. One such decision involves the choice of imputation model: the joint modeling (JM) versus the fully conditional specification (FCS) approach. Another involves the choice of method for handling interactions: imputing the interaction term like any other variable (active imputation), or imputing the main effects and then deriving the interaction (passive imputation). Our study investigates the best approach for performing MI in the presence of interaction effects involving two categorical variables. Such effects warrant special attention, as they involve multiple correlated parameters that are handled differently under JM and FCS modeling. Through an extensive simulation study, we compared active, passive, and an improved passive approach under FCS, since JM precludes passive imputation. We additionally compared JM and FCS techniques using active imputation. Performance of active and passive imputation was comparable. The improved passive approach proved superior to the other two, particularly when the number of parameters corresponding to the interaction was large. JM without rounding and FCS with active imputation were also mostly comparable, with JM outperforming FCS when the number of parameters was large. In a direct comparison of JM active and FCS improved passive, the latter was the clear winner. We recommend improved passive imputation under FCS, along with sensitivity analyses, to handle multi-level interaction terms.
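Active versus passive handling of an interaction can be shown in miniature. In the sketch below, crude single-draw regression imputation stands in for full FCS/MICE, and continuous variables replace the paper's categorical ones; everything here is our illustrative construction.

```python
# Sketch: active vs. passive imputation of an interaction term.
import numpy as np

rng = np.random.default_rng(10)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
inter = x1 * x2
miss = rng.random(n) < 0.3                 # 30% of x1 missing

def regress_impute(target, predictor, miss, rng):
    """One stochastic regression-imputation draw for target[miss]."""
    b = np.polyfit(predictor[~miss], target[~miss], 1)
    resid_sd = np.std(target[~miss] - np.polyval(b, predictor[~miss]))
    return np.polyval(b, predictor[miss]) + rng.normal(0, resid_sd, miss.sum())

x1_imp = x1.copy()
x1_imp[miss] = regress_impute(x1, x2, miss, rng)

# Passive: derive the interaction from the imputed main effect.
inter_passive = x1_imp * x2
# Active: impute the interaction column directly, ignoring its identity.
inter_active = inter.copy()
inter_active[miss] = regress_impute(inter, x2, miss, rng)

print("corr(passive, truth):", np.corrcoef(inter_passive, inter)[0, 1].round(3))
print("corr(active,  truth):", np.corrcoef(inter_active, inter)[0, 1].round(3))
```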

17.
The finite sample moments of the bootstrap estimator of the James-Stein rule are derived and shown to be biased. Analytical results shed some light on the source of the bias and suggest that the bootstrap will be biased in other settings where the moments of the statistic of interest depend on nonlinear functions of the parameters of its distribution.
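A quick Monte Carlo illustration of the phenomenon (the paper's results are analytical; this sketch, with a positive-part James-Stein rule shrinking toward zero, is our own):

```python
# Sketch: bias of the bootstrap estimator of the James-Stein rule.
import numpy as np

rng = np.random.default_rng(11)
p, n = 8, 30
theta = np.full(p, 0.5)
x = rng.normal(theta, 1.0, size=(n, p))

def james_stein(xbar, n, p):
    # Positive-part JS shrinkage of the mean vector toward zero.
    shrink = max(0.0, 1.0 - (p - 2) / (n * np.sum(xbar ** 2)))
    return shrink * xbar

js_hat = james_stein(x.mean(axis=0), n, p)
boot = np.empty((2000, p))
for b in range(2000):
    xb = x[rng.integers(0, n, size=n)]     # resample rows
    boot[b] = james_stein(xb.mean(axis=0), n, p)

# A systematically nonzero deviation reflects the bias the paper derives:
# the JS rule is a nonlinear function of the estimated mean.
print("mean bootstrap deviation from JS estimate:",
      (boot.mean(axis=0) - js_hat).round(3))
```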

19.
It is widely known that bootstrap failure can often be remedied by using a technique known as the 'm out of n' bootstrap, by which a smaller number, m say, of observations are resampled from the original sample of size n. In successful cases of the bootstrap, the m out of n bootstrap is often deemed unnecessary. We show that the problem of constructing nonparametric confidence intervals is an exceptional case. By considering a new class of m out of n bootstrap confidence limits, we develop a computationally efficient approach based on the double bootstrap to construct optimal m out of n bootstrap intervals. We show that the optimal intervals have coverage accuracy comparable with that of classical double-bootstrap intervals, and we conduct a simulation study to examine their performance. The results are in general very encouraging. Alternative approaches which yield even higher-order accuracy are also discussed.
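One crude way to see why m matters is to score m out of n percentile intervals by a double-bootstrap coverage check, as in the sketch below. This is a naive stand-in for the paper's computationally efficient optimality criterion, and all tuning constants are hypothetical.

```python
# Sketch: choosing m for an m out of n bootstrap interval by a crude
# double-bootstrap coverage check.
import numpy as np

rng = np.random.default_rng(12)
x = rng.exponential(1.0, size=100)
n, B1, B2 = len(x), 200, 100

def m_out_of_n_ci(sample, m, B, rng, level=0.95):
    means = [rng.choice(sample, size=m, replace=True).mean() for _ in range(B)]
    a = (1 - level) / 2
    return np.quantile(means, [a, 1 - a])

def estimated_coverage(m):
    hits = 0
    for _ in range(B1):
        xb = rng.choice(x, size=n, replace=True)   # outer bootstrap world
        lo, hi = m_out_of_n_ci(xb, m, B2, rng)
        hits += lo <= x.mean() <= hi               # x.mean() plays the "truth"
    return hits / B1

for m in (20, 40, 60, 80, 100):
    print(f"m = {m:3d}: estimated coverage = {estimated_coverage(m):.3f}")
```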

20.