Similar Documents
20 similar documents found (search time: 31 ms)
1.
Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including resampling methods such as the jackknife and the bootstrap. In this paper, we consider the problem of doubly robust inference in the presence of imputed survey data. In the doubly robust literature, point estimation has been the main focus. Using the reverse framework for variance estimation, we derive doubly robust linearization variance estimators in the case of deterministic and random regression imputation within imputation classes. Also, we study the properties of several jackknife variance estimators under both negligible and nonnegligible sampling fractions. A limited simulation study investigates the performance of various variance estimators in terms of relative bias and relative stability. Finally, the asymptotic normality of imputed estimators is established for stratified multistage designs under both deterministic and random regression imputation. The Canadian Journal of Statistics 40: 259–281; 2012 © 2012 Statistical Society of Canada
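The abstract's central warning, that treating imputed values as observed understates variance, and its jackknife remedy can be sketched as follows. This is a minimal illustration, not the paper's doubly robust estimator: the names are hypothetical, and simple ratio imputation stands in for regression imputation within classes. The key point is that the imputation step is repeated inside every jackknife replicate.

```python
import random

def jackknife_variance(data, estimator):
    """Delete-one jackknife variance estimate for a generic point estimator."""
    n = len(data)
    reps = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    rep_mean = sum(reps) / n
    return (n - 1) / n * sum((r - rep_mean) ** 2 for r in reps)

def mean_with_imputation(pairs):
    """Deterministic ratio imputation: a missing y is imputed by b*x with
    b fitted on respondents.  Re-fitting b inside each jackknife replicate
    is what keeps the variance estimate honest -- treating imputed values
    as observed data would understate the variance."""
    resp = [(x, y) for x, y in pairs if y is not None]
    b = sum(y for _, y in resp) / sum(x for x, _ in resp)
    return sum(y if y is not None else b * x for x, y in pairs) / len(pairs)

random.seed(1)
# roughly 30% nonresponse on y, with y related to the always-observed x
pairs = [(x, 2 * x + random.gauss(0, 1) if random.random() > 0.3 else None)
         for x in [random.uniform(1, 5) for _ in range(40)]]
v = jackknife_variance(pairs, mean_with_imputation)
```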

2.
Recent research makes clear that missing values in datasets are inevitable. Imputation is one of several methods introduced to address this issue: imputation techniques handle missing data by permanently replacing missing values with reasonable estimates. These procedures have many benefits, but also drawbacks: their behaviour is often opaque, which breeds mistrust in the resulting analyses. One way to evaluate the outcome of an imputation process is to estimate the uncertainty in the imputed data. Nonparametric methods are appropriate for estimating this uncertainty when the data do not follow any particular distribution. This paper presents a nonparametric method, based on the Wilcoxon test statistic, for estimating and testing the significance of imputation uncertainty; it can be used to assess the precision of the imputed values created by imputation methods. The procedure can be used to judge the feasibility of imputing a given dataset, and to evaluate competing imputation methods applied to the same dataset. The proposed approach is compared with other nonparametric resampling methods, including the bootstrap and the jackknife, for estimating uncertainty in data imputed under the Bayesian bootstrap imputation method. The ideas supporting the proposed method are explained in detail, and a simulation study illustrates its use in practical situations.

3.
ON BOOTSTRAP HYPOTHESIS TESTING
We describe methods for constructing bootstrap hypothesis tests, illustrating our approach using analysis of variance. The importance of pivotalness is discussed. Pivotal statistics usually result in improved accuracy of level. We note that hypothesis tests and confidence intervals call for different methods of resampling, so as to ensure that accurate critical point estimates are obtained in the former case even when data fail to comply with the null hypothesis. Our main points are illustrated by a simulation study and application to three real data sets.
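The abstract's two points, using a pivotal (studentized) statistic and resampling in a way that complies with the null, can be sketched for the simplest case of testing a mean. This is an illustrative sketch, not the paper's ANOVA setting; the function names are hypothetical.

```python
import random
import statistics

def t_stat(sample, mu0):
    """Studentized (pivotal) statistic for H0: mean == mu0."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / n ** 0.5)

def bootstrap_pvalue(sample, mu0, B=2000, seed=0):
    """Bootstrap test of H0: mean == mu0.  The key point from the abstract:
    resample data re-centred to satisfy the null, so that critical points
    are accurate even when the observed data do not comply with H0."""
    rng = random.Random(seed)
    t_obs = t_stat(sample, mu0)
    centred = [x - statistics.mean(sample) + mu0 for x in sample]
    count = 0
    for _ in range(B):
        boot = [rng.choice(centred) for _ in centred]
        if abs(t_stat(boot, mu0)) >= abs(t_obs):
            count += 1
    return count / B

random.seed(2)
null_sample = [random.gauss(5.0, 1.0) for _ in range(30)]
p = bootstrap_pvalue(null_sample, 5.0)
```

Resampling the raw (uncentred) data instead would test the wrong hypothesis whenever the sample mean differs from mu0, which is exactly the pitfall the abstract flags.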

4.
Alternative methods of estimating properties of unknown distributions include the bootstrap and the smoothed bootstrap. In the standard bootstrap setting, Johns (1988) introduced an importance resampling procedure that results in more accurate approximation to the bootstrap estimate of a distribution function or a quantile. With a suitable "exponential tilting" similar to that used by Johns, we derive a smoothed version of importance resampling in the framework of the smoothed bootstrap. Smoothed importance resampling procedures are developed for the estimation of distribution functions of the Studentized mean, the Studentized variance, and the correlation coefficient. Implementations of these procedures are presented via simulation results, which concentrate on the estimation of distribution functions of the Studentized mean and Studentized variance for different sample sizes and various pre-specified smoothing bandwidths for normal data. Additional simulations were conducted for the estimation of quantiles of the distribution of the Studentized mean under an optimal smoothing bandwidth, with the original data simulated from three different parent populations: lognormal, t(3) and t(10). These results suggest that in cases where it is advantageous to use the smoothed bootstrap rather than the standard bootstrap, the amount of resampling necessary may be substantially reduced by importance resampling, with efficiency gains that depend on the bandwidth used in the kernel density estimation.

5.
The usual covariance estimates for data X_1, ..., X_n from a stationary zero-mean stochastic process {X_t} are the sample covariances c_hat(k) = n^(-1) * sum_{t=1}^{n-k} X_t X_{t+k}. Both direct and resampling approaches are used to estimate the variance of the sample covariances. This paper compares the performance of these variance estimates. Using a direct approach, we show that a consistent windowed periodogram estimate for the spectrum is more effective than using the periodogram itself. A frequency domain bootstrap for time series is proposed and analyzed, and we introduce a frequency domain version of the jackknife that is shown to be asymptotically unbiased and consistent for Gaussian processes. Monte Carlo techniques show that the time domain jackknife and subseries method cannot be recommended. For a Gaussian underlying series a direct approach using a smoothed periodogram is best; for a non-Gaussian series the frequency domain bootstrap appears preferable. For small samples, the bootstraps are dangerous: both the direct approach and frequency domain jackknife are better.
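The estimand in this abstract, the sample covariance of a zero-mean series, is simple to compute directly. A minimal sketch (names hypothetical; an AR(1)-type series stands in for a generic stationary process):

```python
import random

def sample_cov(x, k):
    """Sample covariance c_hat(k) = n^{-1} * sum_{t=1}^{n-k} x_t x_{t+k}
    for a zero-mean stationary series -- the quantity whose variance the
    direct and resampling methods in the abstract try to estimate."""
    n = len(x)
    return sum(x[t] * x[t + k] for t in range(n - k)) / n

random.seed(3)
# AR(1)-type zero-mean series, just to have something stationary to feed in
x, prev = [], 0.0
for _ in range(500):
    prev = 0.5 * prev + random.gauss(0, 1)
    x.append(prev)
c0 = sample_cov(x, 0)   # lag-0: the sample variance of the series
c1 = sample_cov(x, 1)   # lag-1 autocovariance
```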

6.
Exact confidence intervals for variances rely on normal distribution assumptions. Alternatively, large-sample confidence intervals for the variance can be attained if one estimates the kurtosis of the underlying distribution. The method used to estimate the kurtosis has a direct impact on the performance of the interval and thus the quality of statistical inferences. In this paper the author considers a number of kurtosis estimators combined with large-sample theory to construct approximate confidence intervals for the variance. In addition, a nonparametric bootstrap resampling procedure is used to build bootstrap confidence intervals for the variance. Simulated coverage probabilities using different confidence interval methods are computed for a variety of sample sizes and distributions. A modification to a conventional estimator of the kurtosis, in conjunction with adjustments to the mean and variance of the asymptotic distribution of a function of the sample variance, improves the resulting coverage values for leptokurtically distributed populations.
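The large-sample interval described here rests on the standard asymptotic result Var(s^2) ≈ s^4 * (kappa - (n-3)/(n-1)) / n, where kappa is the kurtosis. A minimal sketch using the plain moment estimator of kurtosis (one of several estimators the paper compares; function name hypothetical, and this is not the author's modified estimator):

```python
import math
import random
import statistics

def variance_ci(data, z=1.96):
    """Approximate large-sample CI for the population variance, driven by an
    estimated kurtosis.  Uses the ordinary moment estimator of kurtosis;
    the paper's point is that this choice materially affects coverage."""
    n = len(data)
    m = statistics.mean(data)
    s2 = sum((x - m) ** 2 for x in data) / (n - 1)        # unbiased variance
    m2 = sum((x - m) ** 2 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    kurt = m4 / m2 ** 2                                   # moment kurtosis
    se = math.sqrt(s2 ** 2 * (kurt - (n - 3) / (n - 1)) / n)
    return s2 - z * se, s2 + z * se

random.seed(4)
lo, hi = variance_ci([random.gauss(0, 2) for _ in range(200)])
```

For heavy-tailed (leptokurtic) data the estimated kurtosis inflates the standard error, widening the interval relative to the normal-theory chi-square interval.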

7.
Variance estimators for probability sample-based predictions of species richness (S) are typically conditional on the sample (expected variance). In practical applications, sample sizes are typically small, and the variance of input parameters to a richness estimator should not be ignored. We propose a modified bootstrap variance estimator that attempts to capture the sampling variance by generating B replications of the richness prediction from stochastically resampled data of species incidence. The variance estimator is demonstrated for the observed richness (SO), five richness estimators, and with simulated cluster sampling (without replacement) in 11 finite populations of forest tree species. A key feature of the bootstrap procedure is a probabilistic augmentation of a species incidence matrix by the number of species expected to be ‘lost’ in a conventional bootstrap resampling scheme. In Monte-Carlo (MC) simulations, the modified bootstrap procedure performed well in terms of tracking the average MC estimates of richness and standard errors. Bootstrap-based estimates of standard errors were as a rule conservative. Extensions to other sampling designs, estimators of species richness and diversity, and estimates of change are possible.

8.
Various methods have been suggested in the literature to handle a missing covariate in the presence of surrogate covariates. These methods belong to one of two paradigms. In the imputation paradigm, Pepe and Fleming (1991) and Reilly and Pepe (1995) suggested filling in missing covariates using the empirical distribution of the covariate obtained from the observed data. We can proceed one step further by imputing the missing covariate using nonparametric maximum likelihood estimates (NPMLE) of the density of the covariate. Recently Murphy and Van der Vaart (1998a) showed that such an approach yields a consistent, asymptotically normal, and semiparametric efficient estimate for the logistic regression coefficient. In the weighting paradigm, Zhao and Lipsitz (1992) suggested an estimating function using completely observed records after weighting inversely by the probability of observation. An extension of this weighting approach designed to achieve the semiparametric efficiency bound is considered by Robins, Hsieh and Newey (RHN) (1995). The two ends of each paradigm (NPMLE and RHN) attain the efficiency bound and are asymptotically equivalent. However, both require a substantial amount of computation. A question arises whether and when, in practical situations, this extensive computation is worthwhile. In this paper we investigate the performance of single and multiple imputation estimates, weighting estimates, semiparametric efficient estimates, and two new imputation estimates. Simulation studies suggest that the sample size should be substantially large (e.g. n = 2000) for NPMLE and RHN to be more efficient than simpler imputation estimates. When the sample size is moderately large (n ≤ 1500), simpler imputation estimates have as small a variance as semiparametric efficient estimates.

9.
In this paper, we focus on resampling non-stationary weakly dependent point processes in two dimensions to make inference on the inhomogeneous K function (Baddeley et al., 2000). We provide theoretical results that show a consistency result of the bootstrap estimates of the variance as the observation region and resampling blocks increase in size. We present results of a simulation study that examines the performance of nominal 95% confidence intervals for the inhomogeneous K function obtained via our bootstrap procedure. The procedure is also applied to a rainforest dataset.

10.
Recent developments in sample survey theory include the following topics: foundational aspects of inference, resampling methods for variance and confidence interval estimation, imputation for nonresponse and analysis of complex survey data. An overview and appraisal of some of these developments are presented.

11.
Importance resampling is an approach that uses exponential tilting to reduce the resampling necessary for the construction of nonparametric bootstrap confidence intervals. The properties of bootstrap importance confidence intervals are well established when the data is a smooth function of means and when there is no censoring. However, in the framework of survival or time-to-event data, the asymptotic properties of importance resampling have not been rigorously studied, mainly because of the unduly complicated theory incurred when data is censored. This paper uses extensive simulation to show that, for parameter estimates arising from fitting Cox proportional hazards models, importance bootstrap confidence intervals can be constructed if the importance resampling probabilities of the records for the n individuals in the study are determined by the empirical influence function for the parameter of interest. Our results show that, compared to uniform resampling, importance resampling improves the relative mean-squared-error (MSE) efficiency by a factor of nine (for n = 200). The efficiency increases significantly with sample size, is mildly associated with the amount of censoring, but decreases slightly as the number of bootstrap resamples increases. The extra CPU time requirement for calculating importance resamples is negligible when compared to the large improvement in MSE efficiency. The method is illustrated through an application to data on chronic lymphocytic leukemia, which highlights that the bootstrap confidence interval is the preferred alternative to large sample inferences when the distribution of a specific covariate deviates from normality. Our results imply that, because of its computational efficiency, importance resampling is recommended whenever bootstrap methodology is implemented in a survival framework. Its use is particularly important when complex covariates are involved or the survival problem to be solved is part of a larger problem; for instance, when determining confidence bounds for models linking survival time with clusters identified in gene expression microarray data.
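The mechanism in this abstract, tilting resampling probabilities by empirical influence values, can be sketched in miniature. This is not the paper's Cox-model setting: the sample mean stands in for the Cox coefficient (its influence values are just the centred observations), and all names are hypothetical.

```python
import math
import random

def tilted_probs(influence, lam):
    """Exponential tilting: resampling probability of observation i is
    proportional to exp(lam * l_i), where l_i is its empirical influence
    value for the parameter of interest."""
    w = [math.exp(lam * l) for l in influence]
    s = sum(w)
    return [wi / s for wi in w]

def importance_resample(data, probs, rng):
    # one resample of size n drawn with the tilted (non-uniform) probabilities
    return rng.choices(data, weights=probs, k=len(data))

random.seed(5)
data = [random.gauss(0, 1) for _ in range(50)]
mean = sum(data) / len(data)
infl = [x - mean for x in data]      # influence values for the sample mean
probs = tilted_probs(infl, 0.5)
boot = importance_resample(data, probs, random.Random(0))
```

Resamples drawn this way over-represent the distribution's tail of interest; re-weighting the resulting replicates by the likelihood ratio (not shown) then recovers tail quantiles with far fewer resamples than uniform resampling.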

12.
Quasi-random sequences are known to give efficient numerical integration rules in many Bayesian statistical problems where the posterior distribution can be transformed into periodic functions on the n-dimensional hypercube. From this idea we develop a quasi-random approach to the generation of resamples used for Monte Carlo approximations to bootstrap estimates of bias, variance and distribution functions. We demonstrate a major difference between quasi-random bootstrap resamples, which are generated by deterministic algorithms and have no true randomness, and the usual pseudo-random bootstrap resamples generated by the classical bootstrap approach. Various quasi-random approaches are considered and are shown via a simulation study to result in approximants that are competitive in terms of efficiency when compared with other bootstrap Monte Carlo procedures such as balanced and antithetic resampling.

13.
Variance estimation under systematic sampling with probability proportional to size is known to be a difficult problem. We attempt to tackle this problem by the bootstrap resampling method. It is shown that the usual way to bootstrap fails to give satisfactory variance estimates. As a remedy, we propose a double bootstrap method which is based on certain working models and involves two levels of resampling. Unlike existing methods which deal exclusively with the Horvitz–Thompson estimator, the double bootstrap method can be used to estimate the variance of any statistic. We illustrate this within the context of both mean and median estimation. Empirical results based on five natural populations are encouraging.
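The "two levels of resampling" idea can be shown in skeleton form. This is only a generic two-level sketch, not the authors' working-model construction for PPS systematic designs; all names are hypothetical. It does illustrate the abstract's claim that the approach applies to any statistic, here the median.

```python
import random
import statistics

def bootstrap_var(sample, stat, B, rng):
    """Ordinary one-level bootstrap variance of `stat`."""
    reps = [stat([rng.choice(sample) for _ in sample]) for _ in range(B)]
    m = sum(reps) / B
    return sum((r - m) ** 2 for r in reps) / (B - 1)

def double_bootstrap_var(sample, stat, B1=50, B2=50, seed=0):
    """Two levels of resampling: an outer level rebuilds pseudo-samples and
    an inner level estimates the variance of `stat` within each; the outer
    average stabilises the estimate.  Works for any plug-in statistic."""
    rng = random.Random(seed)
    outer_vars = []
    for _ in range(B1):
        pseudo = [rng.choice(sample) for _ in sample]
        outer_vars.append(bootstrap_var(pseudo, stat, B2, rng))
    return sum(outer_vars) / B1

random.seed(6)
sample = [random.gauss(0, 1) for _ in range(40)]
v_med = double_bootstrap_var(sample, statistics.median)
```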

14.
In practical survey sampling, missing data are unavoidable due to nonresponse, rejected observations by editing, disclosure control, or outlier suppression. We propose a calibrated imputation approach so that valid point and variance estimates of the population (or domain) totals can be computed by the secondary users using simple complete‐sample formulae. This is especially helpful for variance estimation, which generally require additional information and tools that are unavailable to the secondary users. Our approach is natural for continuous variables, where the estimation may be either based on reweighting or imputation, including possibly their outlier‐robust extensions. We also propose a multivariate procedure to accommodate the estimation of the covariance matrix between estimated population totals, which facilitates variance estimation of the ratios or differences among the estimated totals. We illustrate the proposed approach using simulation data in supplementary materials that are available online.

15.
For m-dependent, identically distributed random observations, the bootstrap method provides inconsistent estimators of the distribution and variance of the sample mean. This paper proposes an alternative resampling procedure. For estimating the distribution and variance of a function of the sample mean, the proposed resampling estimators are shown to be strongly consistent.
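The failure mode named here is that single-point resampling destroys the m-dependence. One well-known alternative in the same spirit (not necessarily the authors' specific procedure) is block resampling; a minimal sketch with hypothetical names:

```python
import random

def moving_block_resample(x, block_len, rng):
    """Resample overlapping blocks instead of single points, preserving
    short-range dependence within blocks.  The independent-point bootstrap
    is inconsistent for m-dependent data, which is the abstract's premise."""
    n = len(x)
    blocks = [x[i:i + block_len] for i in range(n - block_len + 1)]
    out = []
    while len(out) < n:
        out.extend(rng.choice(blocks))
    return out[:n]

random.seed(7)
# 1-dependent series: moving sums of iid noise
e = [random.gauss(0, 1) for _ in range(201)]
x = [e[i] + e[i + 1] for i in range(200)]
xb = moving_block_resample(x, block_len=10, rng=random.Random(0))
```

The block length must grow with n (but slower than n) for the block-based variance estimator of the sample mean to be consistent under dependence.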

16.
Some studies of the bootstrap have assessed the effect of smoothing the estimated distribution that is resampled, a process usually known as the smoothed bootstrap. Generally, the smoothed distribution for resampling is a kernel estimate and is often rescaled to retain certain characteristics of the empirical distribution. Typically the effect of such smoothing has been measured in terms of the mean-squared error of bootstrap point estimates. The reports of these previous investigations have not been encouraging about the efficacy of smoothing. In this paper the effect of resampling a kernel-smoothed distribution is evaluated through expansions for the coverage of bootstrap percentile confidence intervals. It is shown that, under the smooth function model, proper bandwidth selection can accomplish a first-order correction for the one-sided percentile method. With the objective of reducing the coverage error the appropriate bandwidth for one-sided intervals converges at a rate of n −1/4, rather than the familiar n −1/5 for kernel density estimation. Applications of this same approach to bootstrap t and two-sided intervals yield optimal bandwidths of order n −1/2. These bandwidths depend on moments of the smooth function model and not on derivatives of the underlying density of the data. The relationship of this smoothing method to both the accelerated bias correction and the bootstrap t methods provides some insight into the connections between three quite distinct approximate confidence intervals.
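Drawing from a kernel-smoothed empirical distribution has a convenient equivalent form: pick a data point uniformly, then perturb it by kernel noise of scale h. A minimal sketch (Gaussian kernel, hypothetical names; the n^(-1/4) bandwidth rate for one-sided percentile intervals comes from the abstract):

```python
import random

def smoothed_bootstrap_sample(data, h, rng):
    """Resample from a Gaussian kernel estimate of the data's density:
    draw a data point, then add kernel noise of bandwidth h.
    h = 0 recovers the ordinary (unsmoothed) bootstrap."""
    return [rng.choice(data) + h * rng.gauss(0, 1) for _ in data]

random.seed(8)
data = [random.gauss(0, 1) for _ in range(100)]
h = len(data) ** -0.25   # the n^{-1/4} rate cited for one-sided intervals,
                         # larger than the usual n^{-1/5} density-estimation rate
smoothed = smoothed_bootstrap_sample(data, h, random.Random(0))
```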

17.
The area under the Receiver Operating Characteristic (ROC) curve (AUC) and related summary indices are widely used for assessment of accuracy of an individual and comparison of performances of several diagnostic systems in many areas including studies of human perception, decision making, and the regulatory approval process for new diagnostic technologies. Many investigators have suggested implementing the bootstrap approach to estimate variability of AUC-based indices. Corresponding bootstrap quantities are typically estimated by sampling a bootstrap distribution. Such a process, frequently termed Monte Carlo bootstrap, is often computationally burdensome and imposes an additional sampling error on the resulting estimates. In this article, we demonstrate that the exact or ideal (sampling error free) bootstrap variances of the nonparametric estimator of AUC can be computed directly, i.e., avoiding resampling of the original data, and we develop easy-to-use formulas to compute them. We derive the formulas for the variances of the AUC corresponding to a single given or random reader, and to the average over several given or randomly selected readers. The derived formulas provide an algorithm for computing the ideal bootstrap variances exactly and hence improve many bootstrap methods proposed earlier for analyzing AUCs by eliminating the sampling error and sometimes burdensome computations associated with a Monte Carlo (MC) approximation. In addition, the availability of closed-form solutions provides the potential for an analytical assessment of the properties of bootstrap variance estimators. Applications of the proposed method are shown on two experimentally ascertained datasets that illustrate settings commonly encountered in diagnostic imaging. In the context of the two examples we also demonstrate the magnitude of the effect of the sampling error of the MC estimators on the resulting inferences.
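The "ideal versus Monte Carlo bootstrap" distinction is easiest to see for the sample mean, where the ideal bootstrap variance has the well-known closed form sum((x_i - xbar)^2)/n^2. The paper derives analogous (more involved) closed forms for the nonparametric AUC; the sketch below uses the mean only as a tractable stand-in, with hypothetical names.

```python
import random

def ideal_bootstrap_var_mean(x):
    """Closed-form ('ideal') bootstrap variance of the sample mean:
    Var*(xbar*) = sum((x_i - xbar)^2) / n^2 -- no resampling, hence no
    Monte Carlo sampling error."""
    n = len(x)
    xbar = sum(x) / n
    return sum((xi - xbar) ** 2 for xi in x) / n ** 2

def mc_bootstrap_var_mean(x, B, rng):
    """Monte Carlo approximation to the same quantity: converges to the
    ideal value as B grows, but carries sampling error for finite B."""
    means = [sum(rng.choice(x) for _ in x) / len(x) for _ in range(B)]
    m = sum(means) / B
    return sum((v - m) ** 2 for v in means) / B

random.seed(9)
x = [random.gauss(0, 1) for _ in range(50)]
exact = ideal_bootstrap_var_mean(x)
approx = mc_bootstrap_var_mean(x, 4000, random.Random(0))
```

Even with B = 4000 resamples the MC value only approximates the exact one; this residual gap is the "additional sampling error" the closed-form approach eliminates.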

18.
The authors propose a bootstrap procedure which estimates the distribution of an estimating function by resampling its terms using bootstrap techniques. Studentized versions of this so‐called estimating function (EF) bootstrap yield methods which are invariant under reparametrizations. This approach often has substantial advantage, both in computation and accuracy, over more traditional bootstrap methods and it applies to a wide class of practical problems where the data are independent but not necessarily identically distributed. The methods allow for simultaneous estimation of vector parameters and their components. The authors use simulations to compare the EF bootstrap with competing methods in several examples including the common means problem and nonlinear regression. They also prove asymptotic results showing that the studentized EF bootstrap yields higher order approximations for the whole vector parameter in a wide class of problems.
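Resampling the terms of an estimating function, rather than the data, can be sketched for the simplest estimating equation, sum(x_i - theta) = 0, whose solution is the mean. This is an illustrative sketch with hypothetical names, not the authors' studentized version.

```python
import random

def ef_bootstrap_dist(scores, B, rng):
    """Estimating-function bootstrap: resample the fitted score terms
    g_i = psi(x_i, theta_hat) and collect the resampled averages.  For the
    mean, psi(x, theta) = x - theta, so each resampled average is the shift
    theta* - theta_hat implied by the resampled estimating equation."""
    n = len(scores)
    return [sum(rng.choice(scores) for _ in range(n)) / n for _ in range(B)]

random.seed(10)
x = [random.gauss(3, 1) for _ in range(60)]
theta_hat = sum(x) / len(x)
scores = [xi - theta_hat for xi in x]      # psi terms evaluated at the fit
shifts = ef_bootstrap_dist(scores, 1000, random.Random(0))
theta_stars = [theta_hat + s for s in shifts]
```

Because only the score terms are resampled, the estimating equation never has to be re-solved on each resample, which is the computational advantage the abstract mentions; studentizing the resampled sums is what makes the method reparametrization-invariant.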

19.
Standard algorithms for the construction of iterated bootstrap confidence intervals are computationally very demanding, requiring nested levels of bootstrap resampling. We propose an alternative approach to constructing double bootstrap confidence intervals that involves replacing the inner level of resampling by an analytical approximation. This approximation is based on saddlepoint methods and a tail probability approximation of DiCiccio and Martin (1991). Our technique significantly reduces the computational expense of iterated bootstrap calculations. A formal algorithm for the construction of our approximate iterated bootstrap confidence intervals is presented, and some crucial practical issues arising in its implementation are discussed. Our procedure is illustrated in the case of constructing confidence intervals for ratios of means using both real and simulated data. We repeat an experiment of Schenker (1985) involving the construction of bootstrap confidence intervals for a variance and demonstrate that our technique makes feasible the construction of accurate bootstrap confidence intervals in that context. Finally, we investigate the use of our technique in a more complex setting, that of constructing confidence intervals for a correlation coefficient.
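The cost being attacked here is the nested scheme itself: B1 outer resamples, each paying for B2 inner resamples, i.e. B1*B2 evaluations of the statistic. A sketch of that expensive baseline (hypothetical names; the paper's contribution is replacing the inner loop with a saddlepoint approximation, which is not shown):

```python
import random

def nested_double_bootstrap(sample, stat, B1, B2, seed=0):
    """The nested scheme the paper seeks to avoid: for each outer resample,
    an inner bootstrap estimates where the original estimate t_hat falls in
    the resample's own bootstrap distribution.  Returns these calibration
    fractions u, whose quantiles adjust the nominal interval level."""
    rng = random.Random(seed)
    t_hat = stat(sample)
    u = []
    for _ in range(B1):
        outer = [rng.choice(sample) for _ in sample]
        inner = [stat([rng.choice(outer) for _ in outer]) for _ in range(B2)]
        u.append(sum(t <= t_hat for t in inner) / B2)   # inner "p-value"
    return u

random.seed(11)
data = [random.gauss(0, 1) for _ in range(25)]
u_vals = nested_double_bootstrap(data, lambda s: sum(s) / len(s), B1=40, B2=40)
```

Each of the B1 outer replicates here costs B2 extra statistic evaluations; an analytic inner approximation collapses that factor to a constant.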

20.
Bootstrap methods are proposed for estimating sampling distributions and associated statistics for regression parameters in multivariate survival data. We use an Independence Working Model (IWM) approach, fitting margins independently, to obtain consistent estimates of the parameters in the marginal models. Resampling procedures, however, are applied to an appropriate joint distribution to estimate covariance matrices, make bias corrections, and construct confidence intervals. The proposed methods allow for fixed or random explanatory variables, the latter case using extensions of existing resampling schemes (Loughin, 1995), and they permit the possibility of random censoring. An application is shown for the viral positivity time data previously analyzed by Wei, Lin, and Weissfeld (1989). A simulation study of small-sample properties shows that the proposed bootstrap procedures provide substantial improvements in variance estimation over the robust variance estimator commonly used with the IWM. This revised version was published online in July 2006 with corrections to the Cover Date.
