Similar articles
 20 similar articles found (search time: 687 ms)
1.
Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including resampling methods such as the jackknife and the bootstrap. In this paper, we consider the problem of doubly robust inference in the presence of imputed survey data. In the doubly robust literature, point estimation has been the main focus. In this paper, using the reverse framework for variance estimation, we derive doubly robust linearization variance estimators in the case of deterministic and random regression imputation within imputation classes. Also, we study the properties of several jackknife variance estimators under both negligible and nonnegligible sampling fractions. A limited simulation study investigates the performance of various variance estimators in terms of relative bias and relative stability. Finally, the asymptotic normality of imputed estimators is established for stratified multistage designs under both deterministic and random regression imputation. The Canadian Journal of Statistics 40: 259–281; 2012 © 2012 Statistical Society of Canada

2.
Item non‐response in surveys occurs when some, but not all, variables are missing. Unadjusted estimators tend to exhibit some bias, called the non‐response bias, if the respondents differ from the non‐respondents with respect to the study variables. In this paper, we focus on item non‐response, which is usually treated by some form of single imputation. We examine the properties of doubly robust imputation procedures, which are those that lead to an estimator that remains consistent if either the outcome variable or the non‐response mechanism is adequately modelled. We establish the double robustness property of the imputed estimator of the finite population distribution function under random hot‐deck imputation within classes. We also discuss the links between our approach and that of Chambers and Dunstan. The results of a simulation study support our findings.
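
As a rough illustration of random hot-deck imputation within classes, the sketch below replaces each missing value with a value drawn at random from the respondents in the same imputation class. It is not the procedure studied in the abstract: the function name `hot_deck_within_classes`, the use of `None` to mark missing values, and the equal-probability (unweighted) donor selection are all assumptions made for this example.

```python
import random
from collections import defaultdict


def hot_deck_within_classes(values, classes, rng):
    """Fill each missing value (None) with a randomly chosen donor value
    from the respondents in the same imputation class.

    Assumption: donors are drawn with equal probability; survey practice
    often draws donors with probability proportional to survey weights.
    """
    # Collect the observed (donor) values for each class.
    donors = defaultdict(list)
    for value, cls in zip(values, classes):
        if value is not None:
            donors[cls].append(value)

    completed = []
    for value, cls in zip(values, classes):
        if value is None:
            if not donors[cls]:
                raise ValueError(f"no donors available in class {cls!r}")
            value = rng.choice(donors[cls])
        completed.append(value)
    return completed


# Hypothetical data: two imputation classes with one nonrespondent each.
values = [10.0, None, 12.0, 30.0, None, 28.0]
classes = ["a", "a", "a", "b", "b", "b"]
completed = hot_deck_within_classes(values, classes, random.Random(0))
```

Observed values are left untouched, and each imputed value is always an actually observed value from the same class, which is why hot-deck methods tend to preserve the within-class distribution.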

3.
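Data augmentation (DA) imputation is one of the most commonly used MCMC multiple imputation methods. Using simulation, this paper studies the coefficient estimates of a linear regression model based on DA imputation and analyzes how the statistical properties of the estimates are affected by the nonresponse mechanism, the nonresponse rate, and the number of imputations. The simulation results show that, under a missing-completely-at-random mechanism, choosing a smaller number of imputations often yields better regression coefficient estimates; under a missing-at-random mechanism, choosing a larger number of imputations as the nonresponse rate increases tends to yield better regression coefficient estimates; and under a not-missing-at-random mechanism, choosing a larger number of imputations does not necessarily yield better regression coefficient estimates.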

4.
Summary.  The paper develops a data augmentation method to estimate the distribution function of a variable, which is partially observed, under a non-ignorable missing data mechanism, and where surrogate data are available. An application to the estimation of hourly pay distributions using UK Labour Force Survey data provides the main motivation. In addition to considering a standard parametric data augmentation method, we consider the use of hot deck imputation methods as part of the data augmentation procedure to improve the robustness of the method. The method proposed is compared with standard methods that are based on an ignorable missing data mechanism, both in a simulation study and in the Labour Force Survey application. The focus is on reducing bias in point estimation, but variance estimation using multiple imputation is also considered briefly.

5.
Fractional regression hot deck imputation (FRHDI) imputes multiple values for each instance of a missing dependent variable. The imputed values are equal to the predicted value plus multiple random residuals. Fractional weights enable variance estimation and preserve correlations. In some circumstances with some starting weight values, existing procedures for computing FRHDI weights can produce negative values. We discuss procedures for constructing non-negative adjusted fractional weights for FRHDI and study performance of the algorithm using simulation. The algorithm can be used effectively with FRHDI procedures for handling missing data in the context of a complex sample survey.

6.
This paper concerns a method of estimation of variance components in a random effect linear model. It is mainly a resampling method and relies on the Jackknife principle. The derived estimators are presented as least squares estimators in an appropriate linear model, and one of them appears as a MINQUE (Minimum Norm Quadratic Unbiased Estimation) estimator. Our resampling method is illustrated by an example given by C. R. Rao [7] and some optimal properties of our estimator are derived for this example. In the last part, this method is used to derive an estimation of variance components in a random effect linear model when one of the components is assumed to be known.

7.
Two-phase sampling is a cost-effective method of data collection using outcome-dependent sampling for the second-phase sample. In order to make efficient use of auxiliary information and to improve domain estimation, mass imputation can be used in two-phase sampling. Rao and Sitter (1995) introduce mass imputation for two-phase sampling and its variance estimation under simple random sampling in both phases. In this paper, we extend the Rao–Sitter method to general sampling designs. The proposed method is further extended to mass imputation for categorical data. A limited simulation study is performed to examine the performance of the proposed methods.

8.
In a general parametric setup, a multivariate regression model is considered when responses may be missing at random while the explanatory variables and covariates are completely observed. Asymptotic optimality properties of maximum likelihood estimators for such models are linked to the Fisher information matrix for the parameters. It is shown that the information matrix is well defined for the missing-at-random model and that it plays the same role as in the complete-data linear models. Applications of the methodologic developments in hypothesis-testing problems, without any imputation of missing data, are illustrated. Some simulation results comparing the proposed method with Rubin's multiple imputation method are presented.

9.
We propose an improved difference-cum-exponential ratio type estimator for estimating the finite population mean in simple and stratified random sampling using two auxiliary variables. We obtain properties of the estimators up to first order of approximation. The proposed class of estimators is found to be more efficient than the usual sample mean estimator, ratio estimator, exponential ratio type estimator, usual two difference type estimators, Rao (1991) estimator, Gupta and Shabbir (2008) estimator, and Grover and Kaur (2011) estimator. We use six real data sets in simple random sampling and two in stratified sampling for numerical comparisons.

10.
In this paper, we suggest three new ratio estimators of the population mean using quartiles of the auxiliary variable when there are missing data from the sample units. The suggested estimators are investigated under the simple random sampling method. We obtain the mean square error equations for these estimators. The suggested estimators are compared with the sample mean and ratio estimators in the case of missing data. They are also compared with the estimators in Singh and Horn [Compromised imputation in survey sampling, Metrika 51 (2000), pp. 267–276], Singh and Deo [Imputation by power transformation, Statist. Papers 45 (2003), pp. 555–579], and Kadilar and Cingi [Estimators for the population mean in the case of missing data, Commun. Stat.-Theory Methods, 37 (2008), pp. 2226–2236], and we present the conditions under which the proposed estimators are more efficient than the other estimators. In terms of accuracy and of the coverage of the bootstrap confidence intervals, the suggested estimators performed better than the other estimators.

11.
Sarjinder Singh 《Statistics》2013,47(5):499-511
In this paper, an alternative estimator of the population mean in the presence of non-response is suggested, which takes the form of Walsh's estimator. The estimator of the mean obtained from the proposed technique remains better than the estimators obtained from the ratio or mean methods of imputation. The mean-squared error (MSE) of the resultant estimator is less than that of the estimator obtained from the ratio method of imputation for the optimum choice of parameters. An estimator of a parameter involved in the new method of imputation is discussed. A ‘warm deck’ method of imputation is also suggested. The MSE expressions for the proposed estimators have been derived analytically and compared empirically. The work has been extended to the case where multi-auxiliary information is used for imputation. Numerical illustrations are also provided.

12.
Resampling methods are a common means of estimating the variance of a statistic of interest when the data contain nonresponse and imputation is used as compensation. Applying resampling methods usually means that subsamples are drawn from the original sample and that variance estimates are computed based on point estimators of several subsamples. However, newer resampling methods such as the rescaling bootstrap of Chipperfield and Preston [Efficient bootstrap for business surveys. Surv Methodol. 2007;33:167–172] include all elements of the original sample in the computation of its point estimator. Thus, procedures to consider imputation in resampling methods cannot be applied in the ordinary way. For such methods, modifications are necessary. This paper presents an approach applying newer resampling methods for imputed data. The Monte Carlo simulation study conducted in the paper shows that the proposed approach leads to reliable variance estimates in contrast to other modifications.
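
The idea of computing a variance estimate from point estimators of several subsamples can be illustrated with the classical delete-one jackknife. The sketch below is a generic textbook version for complete data, not the approach proposed in the abstract: it ignores sampling design and imputation, and the names `jackknife_variance` and `mean` are chosen only for this example. Applied naively to data in which imputed values are treated as observed, an estimator like this would be expected to understate the variance.

```python
def jackknife_variance(sample, estimator):
    """Delete-one jackknife variance estimate of estimator(sample).

    v_J = (n - 1) / n * sum_i (theta_(i) - theta_bar)^2,
    where theta_(i) is the estimate with unit i removed and
    theta_bar is the average of the n replicate estimates.
    """
    n = len(sample)
    # One replicate estimate per deleted unit.
    replicates = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    center = sum(replicates) / n
    return (n - 1) / n * sum((r - center) ** 2 for r in replicates)


def mean(values):
    return sum(values) / len(values)


# Hypothetical complete-data sample.
y = [3.0, 5.0, 8.0, 4.0]
v_jack = jackknife_variance(y, mean)
```

For the sample mean, this estimate coincides with the usual s²/n, which gives a quick sanity check on the implementation.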

13.
Using survey weights, You & Rao [You and Rao, The Canadian Journal of Statistics 2002; 30, 431–439] proposed a pseudo‐empirical best linear unbiased prediction (pseudo‐EBLUP) estimator of a small area mean under a nested error linear regression model. This estimator borrows strength across areas through a linking model, and makes use of survey weights to ensure design consistency and preserve benchmarking property in the sense that the estimators add up to a reliable direct estimator of the mean of a large area covering the small areas. In this article, a second‐order approximation to the mean squared error (MSE) of the pseudo‐EBLUP estimator of a small area mean is derived. Using this approximation, an estimator of MSE that is nearly unbiased is derived; the MSE estimator of You & Rao [You and Rao, The Canadian Journal of Statistics 2002; 30, 431–439] ignored cross‐product terms in the MSE and hence it is biased. Empirical results on the performance of the proposed MSE estimator are also presented. The Canadian Journal of Statistics 38: 598–608; 2010 © 2010 Statistical Society of Canada

14.
Donor imputation is frequently used in surveys. However, very few variance estimation methods that take into account donor imputation have been developed in the literature. This is particularly true for surveys with high sampling fractions using nearest donor imputation, often called nearest‐neighbour imputation. In this paper, the authors develop a variance estimator for donor imputation based on the assumption that the imputed estimator of a domain total is approximately unbiased under an imputation model; that is, a model for the variable requiring imputation. Their variance estimator is valid, irrespective of the magnitude of the sampling fractions and the complexity of the donor imputation method, provided that the imputation model mean and variance are accurately estimated. They evaluate its performance in a simulation study and show that nonparametric estimation of the model mean and variance via smoothing splines brings robustness with respect to imputation model misspecifications. They also apply their variance estimator to real survey data when nearest‐neighbour imputation has been used to fill in the missing values. The Canadian Journal of Statistics 37: 400–416; 2009 © 2009 Statistical Society of Canada

15.
In this paper, bias adjustment of the jackknife estimator of variance attributed to Rao and Sitter (1995) is considered. The bias-adjusted Rao and Sitter (1995) estimator is then calibrated such that its expected value under the imputing superpopulation model remains the same as the expected value of the mean squared error of the ratio estimator in the presence of non-response. A simulation study has been performed to compare six different estimators of variance: four of them are due to Rao and Sitter (1995), and the other two, proposed here, are named the bias-adjusted and bias-adjusted-cum-calibrated estimators. The empirical relative bias and empirical relative efficiency of the two proposed estimators with respect to the four existing estimators of Rao and Sitter (1995) have been investigated through simulations. The bias-adjusted-cum-calibrated estimator has been found to be an efficient estimator in the case of heteroscedastic populations. The present paper considers the situation of simple random sampling without replacement. The possibility of obtaining a negative estimate of variance from the estimator due to Kim et al. (2006) has been pointed out.

16.
Summary.  We consider three sorts of diagnostics for random imputations: displays of the completed data, which are intended to reveal unusual patterns that might suggest problems with the imputations, comparisons of the distributions of observed and imputed data values and checks of the fit of observed data to the model that is used to create the imputations. We formulate these methods in terms of sequential regression multivariate imputation, which is an iterative procedure in which the missing values of each variable are randomly imputed conditionally on all the other variables in the completed data matrix. We also consider a recalibration procedure for sequential regression imputations. We apply these methods to the 2002 environmental sustainability index, which is a linear aggregation of 64 environmental variables on 142 countries.

17.
J. Kleffe 《Statistics》2013,47(2):233-250
The subject of this contribution is to present a survey of new methods for variance component estimation that have appeared in the literature in recent years. Starting from mixed models treated in analysis of variance, research work in this field turned to a more general approach in which the covariance matrix of the vector of observations is assumed to be an unknown linear combination of known symmetric matrices. Much interest has been shown in developing some kinds of optimal estimators for the unknown parameters, and most results were obtained for estimators that are invariant with respect to a certain group of translations. Therefore we restrict attention to this class of estimators. We deal with minimum variance unbiased estimators, least squared errors estimators, maximum likelihood estimators, and Bayes quadratic estimators, and show some relations to the minimum norm quadratic unbiased estimation principle (MINQUE) introduced by C. R. Rao [20]. We do not mention the original motivation of MINQUE since the notion of minimum norm depends on a measure that is not accepted by all statisticians. We also do not deal with other approaches like the Bayesian and fiducial methods, which were successfully applied by S. Portnoy [18], P. Rusolph [22], G. C. Tiao, W. Y. Tan [28], M. J. K. Healy [9] and others, although only in very special situations. Additionally, we add some new results and new insight into the properties of known estimators. We give a new characterization of MINQUE in the class of all estimators, extend explicit expressions for locally optimal quadratic estimators given by C. R. Rao [22] to a slightly more general situation, and prove complete class theorems useful for the computation of Bayes quadratic estimators. We also investigate situations in which Bayes quadratic unbiased estimators do not change if the distribution of the error terms differs from the normal distribution.

18.
This article develops three empirical likelihood (EL) approaches to estimate parameters in nonlinear regression models in the presence of nonignorable missing responses. These are based on the inverse probability weighted (IPW) method, the augmented IPW (AIPW) method and the imputation technique. A logistic regression model is adopted to specify the propensity score. Maximum likelihood estimation is used to estimate parameters in the propensity score by combining the idea of importance sampling and imputing estimating equations. Under some regularity conditions, we obtain the asymptotic properties of the maximum EL estimators of these unknown parameters. Simulation studies are conducted to investigate the finite sample performance of our proposed estimation procedures. Empirical results provide evidence that the AIPW procedure exhibits better performance than the other two procedures. Data from a survey conducted in 2002 are used to illustrate the proposed estimation procedure. The Canadian Journal of Statistics 48: 386–416; 2020 © 2020 Statistical Society of Canada

19.
This article addresses issues in creating public-use data files in the presence of missing ordinal responses and subsequent statistical analyses of the dataset by users. The authors propose a fully efficient fractional imputation (FI) procedure for ordinal responses with missing observations. The proposed imputation strategy retrieves the missing values through the full conditional distribution of the response given the covariates and results in a single imputed data file that can be analyzed by different data users with different scientific objectives. Two most critical aspects of statistical analyses based on the imputed data set, validity and efficiency, are examined through regression analysis involving the ordinal response and a selected set of covariates. It is shown through both theoretical development and simulation studies that, when the ordinal responses are missing at random, the proposed FI procedure leads to valid and highly efficient inferences as compared to existing methods. Variance estimation using the fractionally imputed data set is also discussed. The Canadian Journal of Statistics 48: 138–151; 2020 © 2019 Statistical Society of Canada

20.
Horvitz and Thompson's (HT) [1952. A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47, 663–685] well-known unbiased estimator for a finite population total admits an unbiased estimator for its variance as given by [Yates and Grundy, 1953. Selection without replacement from within strata with probability proportional to size. J. Roy. Statist. Soc. B 15, 253–261], provided the parent sampling design involves a constant number of distinct units in every sample to be chosen. If the design, in addition, ensures uniform non-negativity of this variance estimator, Rao and Wu [1988. Resampling inference with complex survey data. J. Amer. Statist. Assoc. 83, 231–241] have given their re-scaling bootstrap technique to construct confidence intervals and to estimate the mean square error for non-linear functions of finite population totals of several real variables. Horvitz and Thompson's estimators (HTE) are used to estimate the finite population totals. Because they need to equate the bootstrap variance of the bootstrap estimator to the Yates and Grundy estimator (YGE) for the variance of the HTE in the case of a single variable, i.e., in the linear case, the YG variance estimator is required to be positive for the sample usually drawn.
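
For orientation, the two estimators named in this abstract can be written down directly. The sketch below computes the Horvitz-Thompson estimator of a total, the sum of y_i / π_i over sampled units, and the Yates-Grundy variance estimator, the sum over sampled pairs i < j of ((π_i π_j − π_ij) / π_ij)(y_i/π_i − y_j/π_j)², for a fixed-size design. The function names, the dictionary layout for the joint inclusion probabilities, and the choice of simple random sampling without replacement for the demonstration are assumptions of this example, not part of the cited work.

```python
from itertools import combinations


def ht_total(y, pi):
    """Horvitz-Thompson estimator of the population total."""
    return sum(yi / p for yi, p in zip(y, pi))


def yates_grundy_variance(y, pi, pi_joint):
    """Yates-Grundy variance estimator for a fixed-size design.

    pi_joint[(i, j)] holds the joint inclusion probability of sampled
    units i < j (hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for i, j in combinations(range(len(y)), 2):
        pij = pi_joint[(i, j)]
        diff = y[i] / pi[i] - y[j] / pi[j]
        total += (pi[i] * pi[j] - pij) / pij * diff ** 2
    return total


def srswor_probabilities(n, N):
    """First- and second-order inclusion probabilities under simple
    random sampling without replacement of n units from N."""
    pi = [n / N] * n
    pij = n * (n - 1) / (N * (N - 1))
    pi_joint = {pair: pij for pair in combinations(range(n), 2)}
    return pi, pi_joint


# Hypothetical sample of n = 4 units from a population of N = 10.
y = [3.0, 5.0, 8.0, 4.0]
N = 10
pi, pi_joint = srswor_probabilities(len(y), N)
total_hat = ht_total(y, pi)
var_hat = yates_grundy_variance(y, pi, pi_joint)
```

Under simple random sampling without replacement the Yates-Grundy estimator reduces to the familiar N²(1 − n/N)s²/n and is always non-negative. Under unequal-probability designs, individual terms can be negative whenever π_i π_j < π_ij, which is the positivity issue the abstract refers to.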
