Similar literature
 20 similar articles found (search time: 31 ms)
1.
Various bootstrap methods for variance estimation and confidence intervals in complex survey data, where sampling is done without replacement, have been proposed in the literature. The oldest, and perhaps the most intuitively appealing, is the without-replacement bootstrap (BWO) method proposed by Gross (1980). Unfortunately, the BWO method is only applicable to very simple sampling situations. We first introduce extensions of the BWO method to more complex sampling designs. The performance of the BWO and two other bootstrap methods, the rescaling bootstrap (Rao and Wu 1988) and the mirror-match bootstrap (Sitter 1992), is then compared through a simulation study. Together these three methods encompass the various bootstrap proposals.
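
As a rough illustration of the without-replacement bootstrap idea in its simplest setting, the sketch below assumes simple random sampling without replacement and a population size N that is an exact multiple of the sample size n. It is not taken from the paper; the function name `bwo_variance` and its parameters are hypothetical.

```python
import random
import statistics


def bwo_variance(sample, population_size, stat=statistics.mean, reps=1000, seed=0):
    """Bootstrap variance of `stat` by resampling without replacement.

    Assumption of this sketch: the sample is a simple random sample drawn
    without replacement and N = k * n for an integer k.
    """
    n = len(sample)
    k, remainder = divmod(population_size, n)
    if remainder != 0:
        raise ValueError("this sketch assumes N is an integer multiple of n")
    rng = random.Random(seed)
    # Build a pseudo-population holding k copies of every sampled unit.
    pseudo_population = list(sample) * k
    # Draw n units without replacement from the pseudo-population, reps times.
    replicates = [stat(rng.sample(pseudo_population, n)) for _ in range(reps)]
    return statistics.variance(replicates)
```

When N equals n the pseudo-population is the sample itself, every resample is a reordering of it, and the estimated variance of the mean is zero, which matches the absence of sampling error in a census.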

2.
Marginal imputation, which consists of imputing items separately, generally leads to biased estimators of bivariate parameters such as finite population coefficients of correlation. To overcome this problem, two main approaches have been considered in the literature. The first consists of using customary imputation methods such as random hot‐deck imputation and adjusting for the bias at the estimation stage. This approach was studied in Skinner & Rao (2002). In this paper, we extend the results of Skinner & Rao (2002) to the case of arbitrary sampling designs and three variants of random hot‐deck imputation. The second approach consists of using an imputation method that preserves the relationship between variables. Shao & Wang (2002) proposed a joint random regression imputation procedure that succeeds in preserving the relationships between two study variables. One drawback of the Shao–Wang procedure is that it suffers from an additional variability (called the imputation variance) due to the random selection of residuals, resulting in potentially inefficient estimators. Following Chauvet, Deville, & Haziza (2011), we propose a fully efficient version of the Shao–Wang procedure that preserves the relationship between two study variables, while virtually eliminating the imputation variance. Results of a simulation study support our findings. An application using data from the Workplace and Employees Survey is also presented. The Canadian Journal of Statistics 40: 124–149; 2012 © 2011 Statistical Society of Canada
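
A minimal sketch of marginal random hot‐deck imputation may help show where the bias comes from: each item is filled in from its own respondents, and the donors for the two items are drawn independently, so imputed pairs carry no information about the relationship between the variables. The code assumes unweighted donor selection and uses `None` for a missing value; the function names are hypothetical and the bias adjustment studied in the paper is not reproduced.

```python
import random


def random_hot_deck(values, rng):
    """Replace each None with the value of a donor drawn at random from the respondents."""
    donors = [v for v in values if v is not None]
    if not donors:
        raise ValueError("no respondents available to donate values")
    return [v if v is not None else rng.choice(donors) for v in values]


def marginal_hot_deck(x, y, seed=0):
    """Impute x and y separately; donors for x and y are chosen independently."""
    rng = random.Random(seed)
    return random_hot_deck(x, rng), random_hot_deck(y, rng)
```

Observed values are left untouched; only the missing entries are replaced, each by an observed value of the same item.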

3.
In this paper, a bias adjustment to the jackknife estimator of variance due to Rao and Sitter (1995) is considered. The bias-adjusted Rao and Sitter (1995) estimator is then calibrated so that its expected value under the imputing superpopulation model remains the same as the expected value of the mean squared error of the ratio estimator in the presence of non-response. A simulation study compares six estimators of variance: four are due to Rao and Sitter (1995), and the other two are the proposed bias-adjusted and bias-adjusted-cum-calibrated estimators. The empirical relative bias and empirical relative efficiency of the two proposed estimators with respect to the four existing estimators of Rao and Sitter (1995) are investigated through simulations. The bias-adjusted-cum-calibrated estimator is found to be efficient in the case of heteroscedastic populations. The present paper considers simple random sampling without replacement. The possibility of obtaining a negative estimate of variance from the estimator due to Kim et al. (2006) is pointed out.

4.
The present investigation addresses the problem of estimating a finite population mean in two-phase cluster sampling in the presence of random non-response. Utilizing information on an auxiliary variable, regression-type estimators have been proposed. Effective imputation techniques have been suggested to deal with random non-response. The properties of the proposed estimation strategies have been studied for different cases of random non-response in practical surveys. The superiority of the suggested methodology over the natural sample mean estimator of the population mean has been established through empirical studies carried out on data sets from a natural population and an artificially generated population.

5.
Donor imputation is frequently used in surveys. However, very few variance estimation methods that take into account donor imputation have been developed in the literature. This is particularly true for surveys with high sampling fractions using nearest donor imputation, often called nearest‐neighbour imputation. In this paper, the authors develop a variance estimator for donor imputation based on the assumption that the imputed estimator of a domain total is approximately unbiased under an imputation model; that is, a model for the variable requiring imputation. Their variance estimator is valid, irrespective of the magnitude of the sampling fractions and the complexity of the donor imputation method, provided that the imputation model mean and variance are accurately estimated. They evaluate its performance in a simulation study and show that nonparametric estimation of the model mean and variance via smoothing splines brings robustness with respect to imputation model misspecifications. They also apply their variance estimator to real survey data when nearest‐neighbour imputation has been used to fill in the missing values. The Canadian Journal of Statistics 37: 400–416; 2009 © 2009 Statistical Society of Canada

6.
In studies of disease inheritance, it is more convenient to collect family data by first locating an affected individual and then enquiring about the status of his or her relatives. Although the different categories of children classified by disease, sex, and other covariates may have a particular multinomial distribution among families of a given size, the numbers as ascertained do not have the same distribution because of unequal probabilities of selection of families. The introduction of weighted distributions to correct for ascertainment bias in the estimation of parameters in the classical segregation model can be traced to Fisher in 1934. This theory was presented in a general formulation by C. R. Rao at the First International Symposium on Classical and Contagious Distributions in 1963. Further expansion on the topic was given by C. R. Rao in the ISI Centenary Volume published in 1985. The effects of different two-phase sampling designs on the estimation of parameters in the classical segregation model are examined. An approximation to the classical segregation likelihood model is found to produce results close to those of the exact likelihood function in Monte Carlo simulations for a balanced two-phase design. This has implications for more complex models in which the computation of the exact likelihood is prohibitive, such as for the enhancement of a typical survey sampling plan designed initially for linkage analysis but then used retroactively for a combined segregation and linkage analysis.

7.
Singh and Arnab (2010) presented a bias adjustment to the jackknife variance estimator of Rao and Sitter (1995) in the presence of non-response. In their paper, they obtained a second-order approximation of the bias of the Rao-Sitter variance estimator and then proposed a bias-adjusted estimator based on this approximation. To compare their proposed variance estimator to various other variance estimators, they performed a simulation study and showed that their variance estimator is superior to the Rao-Sitter variance estimator. In fact they showed that the Rao-Sitter variance estimator suffers from severe underestimation. These results contradict those in the literature, which indicate that the Rao-Sitter variance estimator suffers from a positive bias if the sampling fractions are not negligible; see Rao and Sitter (1995), Lee et al. (1995) and Haziza and Picard (2011). Because of this contradiction, we felt that a further investigation was warranted. In this paper, we attempt to recreate the results of Singh and Arnab (2010) and, in fact, show that their second order approximation to the bias of the Rao-Sitter variance estimator is incorrect and that their simulation results are also questionable.

8.
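于力超 (Yu Lichao), 金勇进 (Jin Yongjin). 《统计研究》 (Statistical Research), 2018, 35(11): 93-104
Large-scale sample surveys mostly use complex sampling designs, which yield survey data sets with a hierarchical, nested structure in which missing data are unavoidable. Imputation strategies for hierarchically structured data sets with missing values have so far received little study. This paper applies the Gibbs algorithm to the multiple imputation of hierarchical data sets with missing values and studies imputation under a fixed effects model and under a random effects model. Through theoretical derivation and numerical simulation, and under different intraclass correlation coefficients, group sizes, and proportions of missing data, it compares the imputation performance of the methods in terms of the unbiasedness and efficiency of the parameter estimates, and gives recommendations for choosing the imputation model. The results show that using a random effects model as the imputation model gives more accurate parameter estimates, whereas a fixed effects imputation model is relatively simpler to apply. A fixed effects imputation model can be used when the proportion of missing data is small, the intraclass correlation coefficient is large, or the group size is large; otherwise a random effects imputation model is recommended.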

9.
The Hartley‐Rao‐Cochran sampling design is an unequal probability sampling design which can be used to select samples from finite populations. We propose to adjust the empirical likelihood approach for the Hartley‐Rao‐Cochran sampling design. The approach proposed intrinsically incorporates sampling weights, auxiliary information and allows for large sampling fractions. It can be used to construct confidence intervals. In a simulation study, we show that the coverage may be better for the empirical likelihood confidence interval than for standard confidence intervals based on variance estimates. The approach proposed is simple to implement and less computer intensive than bootstrap. The confidence interval proposed does not rely on re‐sampling, linearization, variance estimation, design‐effects or joint inclusion probabilities.
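
For readers unfamiliar with the design, the following sketch shows the sampling scheme itself and the usual point estimator of a total, not the empirical likelihood approach of the paper. It assumes positive size measures and n no larger than the population size; the names `rhc_sample`, `rhc_total`, and `sizes` are illustrative only.

```python
import random


def rhc_sample(sizes, n, rng):
    """Split the units into n random groups and draw one unit per group,
    with probability proportional to its size measure within the group.

    Returns (unit index, group size total) pairs. Assumes n <= len(sizes)
    and strictly positive size measures.
    """
    units = list(range(len(sizes)))
    rng.shuffle(units)
    groups = [units[g::n] for g in range(n)]  # n non-empty random groups
    draws = []
    for group in groups:
        weights = [sizes[i] for i in group]
        chosen = rng.choices(group, weights=weights, k=1)[0]
        draws.append((chosen, sum(weights)))
    return draws


def rhc_total(draws, y, sizes):
    """Estimate the population total of y: sum of y_i * (group size total / size_i)."""
    return sum(y[i] * group_total / sizes[i] for i, group_total in draws)
```

If the study variable is exactly proportional to the size measure, this estimator reproduces the population total whichever units are drawn, which is a convenient sanity check for an implementation.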

10.
Parameter estimation with missing data is a frequently encountered problem in statistics. Imputation is often used to facilitate the parameter estimation by simply applying the complete-sample estimators to the imputed dataset. In this article, we consider the problem of parameter estimation with nonignorable missing data using the approach of parametric fractional imputation proposed by Kim (2011). Using the fractional weights, the E-step of the EM algorithm can be approximated by the weighted mean of the imputed data likelihood where the fractional weights are computed from the current value of the parameter estimates. Calibration fractional imputation is also considered as a way of improving the Monte Carlo approximation in the fractional imputation. Variance estimation is also discussed. Results from two simulation studies are presented to compare the proposed method with the existing methods. A real data example from the Korea Labor and Income Panel Survey (KLIPS) is also presented.

11.
In modern scientific research, multiblock missing data emerges with synthesizing information across multiple studies. However, existing imputation methods for handling block-wise missing data either focus on the single-block missing pattern or heavily rely on the model structure. In this study, we propose a single regression-based imputation algorithm for multiblock missing data. First, we conduct a sparse precision matrix estimation based on the structure of block-wise missing data. Second, we impute the missing blocks with their means conditional on the observed blocks. Theoretical results about variable selection and estimation consistency are established in the context of a generalized linear model. Moreover, simulation studies show that compared with existing methods, the proposed imputation procedure is robust to various missing mechanisms because of the good properties of regression imputation. An application to Alzheimer's Disease Neuroimaging Initiative data also confirms the superiority of our proposed method.

12.
In practical survey sampling, missing data are unavoidable due to nonresponse, rejected observations by editing, disclosure control, or outlier suppression. We propose a calibrated imputation approach so that valid point and variance estimates of the population (or domain) totals can be computed by the secondary users using simple complete‐sample formulae. This is especially helpful for variance estimation, which generally requires additional information and tools that are unavailable to the secondary users. Our approach is natural for continuous variables, where the estimation may be either based on reweighting or imputation, including possibly their outlier‐robust extensions. We also propose a multivariate procedure to accommodate the estimation of the covariance matrix between estimated population totals, which facilitates variance estimation of the ratios or differences among the estimated totals. We illustrate the proposed approach using simulation data in supplementary materials that are available online.

13.
Imputation is a much used method for handling missing data. It is appealing as it separates the missing data part of the analysis, which is handled by imputation, and the estimation part, which is handled by complete data methods. Most imputation methods, however, either rely on strict parametric assumptions or are rather ad hoc in which case they often only work approximately under even stricter assumptions. In this paper a non-parametric imputation method is proposed. Since it is non-parametric it works under quite general assumptions. In particular, a model for the complete data is not required in the imputation step, and the complete data method used after the imputation may be a general estimating equation for estimating a finite-dimensional parameter. Large sample results for the resulting estimator are given.

14.
Summary.  The paper develops a data augmentation method to estimate the distribution function of a variable, which is partially observed, under a non-ignorable missing data mechanism, and where surrogate data are available. An application to the estimation of hourly pay distributions using UK Labour Force Survey data provides the main motivation. In addition to considering a standard parametric data augmentation method, we consider the use of hot deck imputation methods as part of the data augmentation procedure to improve the robustness of the method. The method proposed is compared with standard methods that are based on an ignorable missing data mechanism, both in a simulation study and in the Labour Force Survey application. The focus is on reducing bias in point estimation, but variance estimation using multiple imputation is also considered briefly.

15.
In this article, we consider the problem of estimating the mode using two-phase sampling. Ratio- and difference-type estimators in two-phase sampling are proposed. The asymptotic properties of the proposed estimators are studied analytically as well as empirically for different situations for a given survey cost. Two-phase sampling is found to be a cost-saving design for estimating the mode when proper use is made of auxiliary information.

16.
Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including resampling methods such as the jackknife and the bootstrap. In this paper, we consider the problem of doubly robust inference in the presence of imputed survey data. In the doubly robust literature, point estimation has been the main focus. In this paper, using the reverse framework for variance estimation, we derive doubly robust linearization variance estimators in the case of deterministic and random regression imputation within imputation classes. Also, we study the properties of several jackknife variance estimators under both negligible and nonnegligible sampling fractions. A limited simulation study investigates the performance of various variance estimators in terms of relative bias and relative stability. Finally, the asymptotic normality of imputed estimators is established for stratified multistage designs under both deterministic and random regression imputation. The Canadian Journal of Statistics 40: 259–281; 2012 © 2012 Statistical Society of Canada

17.
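This paper studies parameter estimation for linear regression models with missing skewed data. To overcome the distortion of the sample distribution and to improve the estimation of the model's regression coefficients, scale parameter, and skewness parameter, a modified regression imputation method suited to missing data in linear regression models with skewed data is proposed. Simulation and a real-data example, with comparisons against mean imputation, regression imputation, and random regression imputation, show that the proposed modified regression imputation method is effective and feasible.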
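
The sketch below illustrates two of the comparison methods named in the abstract, regression imputation and random regression imputation, for a single covariate. It does not attempt the modified method proposed in the paper, whose details the abstract does not give. It assumes a fully observed covariate `x`, a response `y` with `None` for missing values, an ordinary least squares fit on the respondents, and residuals resampled from the respondents; all names are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_impute(x, y, random_residuals=False, seed=0):
    """Fill missing y values (None) with fitted values from the respondents.

    With random_residuals=True, a residual drawn at random from the
    respondents is added to each fitted value (random regression imputation).
    """
    observed = [(a, b) for a, b in zip(x, y) if b is not None]
    intercept, slope = fit_line([a for a, _ in observed], [b for _, b in observed])
    residuals = [b - (intercept + slope * a) for a, b in observed]
    rng = random.Random(seed)
    completed = []
    for a, b in zip(x, y):
        if b is None:
            b = intercept + slope * a
            if random_residuals:
                b += rng.choice(residuals)
        completed.append(b)
    return completed
```

Deterministic regression imputation places every imputed value exactly on the fitted line, which shrinks the spread of the completed data; adding resampled residuals restores that spread at the cost of extra imputation variance.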

18.
An extension of the Kleffe–Rao model, an extended mixed model with random sampling variances, is considered. Empirical Bayes estimation is found to be very effective under such a model. The empirical Bayes estimators do not have a closed form. A second order Laplace approximation is proposed which works well for moderately large sample sizes. This approximation is especially useful when the uncertainties of the proposed empirical Bayes estimators are measured by the parametric bootstrap technique. A numerical example is considered to demonstrate the method.

19.
This article presents the calibration procedure of the two-phase randomized response (RR) technique for surveying a sensitive characteristic. When the sampling scheme is two-phase or double sampling, auxiliary information known from the entire population can be used, but the auxiliary information should be information available from both the first and second phases of the sample. If there is auxiliary information available from both the first and second phases, then we can improve the ordinary two-phase RR estimator by incorporating this information in the estimation procedure. In this article, we use the new two-step Newton's method for computing unknown constants in the calibration procedure and compare the efficiency of the proposed estimator through a numerical study.

20.
Recent research has made clear that missing values in datasets are inevitable. Imputation of missing data is one of several methods that have been introduced to overcome this issue. Imputation techniques address missing data by permanently replacing missing values with reasonable estimates. The benefits of these procedures outweigh their drawbacks. However, how these methods operate has not been clarified, which creates mistrust in the analytical results. One approach to evaluating the outcome of the imputation process is to estimate the uncertainty in the imputed data. Nonparametric methods are appropriate for estimating this uncertainty when the data do not follow any particular distribution. This paper deals with a nonparametric method for estimating and testing the significance of the imputation uncertainty, which is based on the Wilcoxon test statistic and can be employed to estimate the precision of the imputed values created by imputation methods. The proposed procedure can be used to judge the feasibility of the imputation process for a dataset and to evaluate the influence of proper imputation methods when they are applied to the same dataset. The proposed approach is compared with other nonparametric resampling methods, including the bootstrap and the jackknife, for estimating uncertainty in the imputed data under the Bayesian bootstrap imputation method. The ideas supporting the proposed method are explained in detail, and a simulation study illustrates how the approach is employed in practical situations.
