Similar articles (20 results)
1.
Summary: One of the most salient data problems empirical researchers face is the lack of informative responses in survey data. This contribution briefly surveys the literature on item nonresponse behavior and its determinants before it describes four approaches to address item nonresponse problems: casewise deletion of observations, weighting, imputation, and model-based procedures. We describe the basic approaches, their strengths and weaknesses, and illustrate some of their effects using a simulation study. The paper concludes with some recommendations for the applied researcher. We are grateful to an anonymous referee who provided helpful comments. We would also like to thank Donald B. Rubin for helpful comments and always motivating discussions, as well as Ralf Münnich for inspiring discussions about raking procedures.

2.
Summary: This paper deals with item nonresponse on income questions in panel surveys and with longitudinal and cross-sectional imputation strategies to cope with this phenomenon. Using data from the German SOEP, we compare income inequality and mobility indicators based only on truly observed information to those derived from observed and imputed observations. First, we find a positive correlation between inequality and imputation. Second, income mobility appears to be significantly understated using observed information only. Finally, longitudinal analyses provide evidence for a positive inter-temporal correlation between item nonresponse and any kind of subsequent nonresponse. * We are grateful to two anonymous referees and to Jan Goebel for very helpful comments and suggestions on an earlier draft of this paper. The paper also benefited from discussions with seminar participants at the Workshop on Item Nonresponse and Data Quality in Large Social Surveys, Basel/CH, October 9–11, 2003.

3.
Useful properties of a general-purpose imputation method for numerical data are suggested and discussed in the context of several large government surveys. Imputation based on predictive mean matching is proposed as a useful extension of methods in existing practice, and versions of the method are presented for unit nonresponse and item nonresponse with a general pattern of missingness. Extensions of the method to provide multiple imputations are also considered. Pros and cons of weighting adjustments are discussed, and weighting-based analogs to predictive mean matching are outlined.
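
To make the idea of predictive mean matching concrete, here is a minimal sketch, not the method of the paper. It assumes a single covariate, a simple linear regression fitted on respondents only, and a single nearest donor; the names `fit_line` and `pmm_impute` and the toy data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one covariate."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y):
    """Fill None entries of y with the observed y of the donor whose
    predicted mean is closest to the predicted mean of the missing unit."""
    donors = [i for i, v in enumerate(y) if v is not None]
    # Fit the regression on respondents (donors) only.
    a, b = fit_line([x[i] for i in donors], [y[i] for i in donors])
    pred = [a + b * xi for xi in x]
    out = list(y)
    for i, v in enumerate(y):
        if v is None:
            nearest = min(donors, key=lambda d: abs(pred[d] - pred[i]))
            out[i] = y[nearest]  # impute an actually observed value
    return out


# Hypothetical toy data: None marks item nonresponse.
x = [1.0, 2.5, 3.0, 4.0, 5.0]
y = [2.1, None, 6.2, 7.9, None]
completed = pmm_impute(x, y)
```

Because the imputed value is always copied from a respondent, the completed data contain only values that were actually observed, which is one reason the approach is attractive for numerical survey items.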

4.
Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including resampling methods such as the jackknife and the bootstrap. In this paper, we consider the problem of doubly robust inference in the presence of imputed survey data. In the doubly robust literature, point estimation has been the main focus. In this paper, using the reverse framework for variance estimation, we derive doubly robust linearization variance estimators in the case of deterministic and random regression imputation within imputation classes. Also, we study the properties of several jackknife variance estimators under both negligible and nonnegligible sampling fractions. A limited simulation study investigates the performance of various variance estimators in terms of relative bias and relative stability. Finally, the asymptotic normality of imputed estimators is established for stratified multistage designs under both deterministic and random regression imputation. The Canadian Journal of Statistics 40: 259–281; 2012 © 2012 Statistical Society of Canada

5.
Resampling methods are commonly used to estimate the variance of a statistic of interest when data are subject to nonresponse and imputation is used as compensation. Applying resampling methods usually means that subsamples are drawn from the original sample and that variance estimates are computed based on point estimators of several subsamples. However, newer resampling methods such as the rescaling bootstrap of Chipperfield and Preston [Efficient bootstrap for business surveys. Surv Methodol. 2007;33:167–172] include all elements of the original sample in the computation of its point estimator. Thus, procedures to consider imputation in resampling methods cannot be applied in the ordinary way. For such methods, modifications are necessary. This paper presents an approach applying newer resampling methods for imputed data. The Monte Carlo simulation study conducted in the paper shows that the proposed approach leads to reliable variance estimates in contrast to other modifications.

6.
To reduce nonresponse bias in sample surveys, a method of nonresponse weighting adjustment is often used which consists of multiplying the sampling weight of the respondent by the inverse of the estimated response probability. The authors examine the asymptotic properties of this estimator. They prove that it is generally more efficient than an estimator which uses the true response probability, provided that the parameters which govern this probability are estimated by maximum likelihood. The authors discuss variance estimation methods that account for the effect of using the estimated response probability; they compare their performances in a small simulation study. They also discuss extensions to the regression estimator.
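
The weighting adjustment described above can be sketched as follows. This is only an illustration under stated assumptions: the response probability is estimated by the weighted response rate within adjustment cells rather than by the maximum likelihood fit the abstract refers to, and the function name `adjusted_weights` and the sample records are hypothetical.

```python
from collections import defaultdict


def adjusted_weights(units):
    """units: list of dicts with 'weight' (sampling weight), 'cell'
    (adjustment cell label) and 'responded' (bool).
    Returns the adjusted weights of the respondents, in input order."""
    total = defaultdict(float)
    resp = defaultdict(float)
    for u in units:
        total[u["cell"]] += u["weight"]
        if u["responded"]:
            resp[u["cell"]] += u["weight"]
    # Estimated response probability per cell: weighted response rate
    # (an assumption of this sketch, not the authors' estimator).
    p_hat = {c: resp[c] / total[c] for c in total}
    # Sampling weight times the inverse of the estimated probability.
    return [u["weight"] / p_hat[u["cell"]] for u in units if u["responded"]]


# Hypothetical sample with two adjustment cells.
sample = [
    {"weight": 10.0, "cell": "A", "responded": True},
    {"weight": 10.0, "cell": "A", "responded": False},
    {"weight": 20.0, "cell": "B", "responded": True},
    {"weight": 20.0, "cell": "B", "responded": True},
    {"weight": 20.0, "cell": "B", "responded": False},
    {"weight": 20.0, "cell": "B", "responded": False},
]
w_adj = adjusted_weights(sample)
```

In this toy sample the adjusted respondent weights sum to 100, the total of the original sampling weights, so the respondents are weighted up to represent the full sample.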

7.
This study investigated the bias of factor loadings obtained from incomplete questionnaire data with imputed scores. Three models were used to generate discrete ordered rating scale data typical of questionnaires, also known as Likert data. These models were the multidimensional polytomous latent trait model, a normal ogive item response theory model, and the discretized normal model. Incomplete data due to nonresponse were simulated using either missing completely at random or not missing at random mechanisms. Subsequently, for each incomplete data matrix, four imputation methods were applied for imputing item scores. Based on a completely crossed six-factor design, it was concluded that, in general, bias was small for all data simulation methods and all imputation methods, and under all nonresponse mechanisms. The two-way-plus-error imputation method had the smallest bias in the factor loadings. Bias based on the discretized normal model was greater than that based on the other two models.

8.
Nonignorable nonresponse is a nonresponse mechanism that depends on the values of the variable having nonresponse. When observed data from a binomial distribution suffer from missing values due to a nonignorable nonresponse mechanism, the binomial distribution parameters become unidentifiable without any other auxiliary information or assumption. To address the problem of nonidentifiability, existing methods are mostly based on the log-linear regression model. In this article, we focus on the model when the nonresponse is nonignorable and we consider using auxiliary data to improve identifiability; furthermore, we derive the maximum likelihood estimator (MLE) for the binomial proportion and its associated variance. We present results for an analysis of real-life data from the SARS study in China. Finally, the simulation study shows that the proposed method gives promising results.

9.
Caren Hasler, Yves Tillé. Statistics, 2016, 50(6): 1310-1331
Random imputation is an interesting class of imputation methods to handle item nonresponse because it tends to preserve the distribution of the imputed variable. However, such methods amplify the total variance of the estimators because values are imputed at random. This increase in variance is called imputation variance. In this paper, we propose a new random hot-deck imputation method that is based on the k-nearest neighbour methodology. It replaces the missing value of a unit with the observed value of a similar unit. Calibration and balanced sampling are applied to minimize the imputation variance. Moreover, our proposed method provides triple protection against nonresponse bias. This means that if at least one out of three specified models holds, then the resulting total estimator is unbiased. Finally, our approach allows the user to perform consistency edits and to impute simultaneously.
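
The basic mechanism of a random hot-deck restricted to nearest neighbours can be sketched as below. This is a simplified illustration, not the authors' procedure: it omits the calibration and balanced sampling steps, uses a single auxiliary variable with absolute distance, and the name `knn_hot_deck`, the default `k` and the data are hypothetical.

```python
import random


def knn_hot_deck(x, y, k=2, seed=0):
    """Replace None in y by the observed y of a donor drawn at random
    among the k respondents closest in the auxiliary variable x."""
    rng = random.Random(seed)  # local generator for reproducibility
    donors = [i for i, v in enumerate(y) if v is not None]
    out = list(y)
    for i, v in enumerate(y):
        if v is None:
            # The k respondents most similar to unit i.
            nearest = sorted(donors, key=lambda d: abs(x[d] - x[i]))[:k]
            out[i] = y[rng.choice(nearest)]
    return out


# Hypothetical data: two clusters of units, two nonrespondents.
x = [1.0, 1.2, 5.0, 5.1, 1.1, 5.2]
y = [10.0, 11.0, 50.0, 52.0, None, None]
completed = knn_hot_deck(x, y, k=2)
```

Drawing the donor at random among similar units is what tends to preserve the distribution of the imputed variable, at the price of the extra imputation variance mentioned in the abstract.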

10.
This article proposes and evaluates two new methods of reweighting preliminary data to obtain estimates more closely approximating those derived from the final data set. In our motivating example, the preliminary data are an early sample of tax returns, and the final data set is the sample after all tax returns have been processed. The new methods estimate a predicted propensity for late filing for each return in the advance sample and then poststratify based on these propensity scores. Using advance and complete sample data for 1982, we demonstrate that the new methods produce advance estimates generally much closer to the final estimates than those derived from the current advance estimation techniques. The results demonstrate the value of propensity modeling, a general-purpose methodology that can be applied to a wide range of problems, including adjustment for unit nonresponse and frame undercoverage as well as statistical matching.

11.
Response errors have become extremely important in increasingly complex surveys, and a review of the ever-expanding literature on the subject was judged necessary. The emphasis here is on models for response and nonresponse errors, the most recent ones incorporating the concept of response probabilities.

12.
Summary: In this paper we examine the tendency for branching instructions to be ignored, misread, or otherwise not appropriately followed so that item nonresponse occurs for follow-up questions. The potential influence on branching errors of seven features of question complexity is examined, including high number of question words, high number of answer categories, last categories branch, all categories branch, write-in responses, location at the bottom of a page, and high distance between the answer box and branching instruction. A logistic regression analysis revealed that question complexity had a tendency to increase certain errors, but not others. * A more detailed version of this paper with additional analysis and discussion is available at http://www.sesrc.wsu.edu/dillman/. The opinions expressed here are those of the authors, not necessarily of the institutions where they presently work or the U.S. Census Bureau, which provided financial support for the collection of these data. We would like to thank Aref Dajani and Yves Thibaudeau for their advice on the analysis used in this paper.

13.
Nonresponse is very common in surveys; it complicates the analysis and can lead to inappropriate inference. To counterbalance the adverse effects of this incompleteness, new imputation techniques have been proposed with the aid of multi-auxiliary variates for the estimation of the population mean on successive waves. Properties of the proposed estimators have been elaborated, and they have been compared with the work of Priyanka et al. (2015). A detailed simulation study is carried out to substantiate the empirical and theoretical results. Several possible cases have been addressed in which nonresponse can occur.

14.
Nonresponse is a major source of estimation error in sample surveys. The response rate is widely used to measure survey quality associated with nonresponse, but is inadequate as an indicator because of its limited relation with nonresponse bias. Schouten et al. (2009) proposed an alternative indicator, which they refer to as an indicator of representativeness or R-indicator. This indicator measures the variability of the probabilities of response for units in the population. This paper develops methods for the estimation of this R-indicator assuming that values of a set of auxiliary variables are observed for both respondents and nonrespondents. We propose bias adjustments to the point estimator proposed by Schouten et al. (2009) and demonstrate the effectiveness of this adjustment in a simulation study where it is shown that the method is valid, especially for smaller sample sizes. We also propose linearization variance estimators which avoid the need for computer-intensive replication methods and show good coverage in the simulation study even when models are not fully specified. The use of the proposed procedures is also illustrated in an application to two business surveys at Statistics Netherlands.
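
As a rough illustration of how such an indicator behaves, the sketch below computes R = 1 - 2S, where S is the standard deviation of the estimated response probabilities. The formula is the definition as it is commonly stated for the R-indicator and is not spelled out in the abstract; the sketch further assumes equal design weights, takes the estimated propensities as given, and applies none of the bias adjustments the paper proposes. The function name `r_indicator` and the inputs are hypothetical.

```python
import math


def r_indicator(propensities):
    """R = 1 - 2 * S, with S the sample standard deviation (n - 1
    denominator) of the estimated response propensities.
    Assumes equal design weights for all units."""
    n = len(propensities)
    mean = sum(propensities) / n
    var = sum((p - mean) ** 2 for p in propensities) / (n - 1)
    return 1.0 - 2.0 * math.sqrt(var)


# Hypothetical propensities: identical versus strongly varying.
uniform = r_indicator([0.5, 0.5, 0.5, 0.5])
uneven = r_indicator([0.2, 0.4, 0.8, 1.0])
```

When every unit has the same response probability the indicator equals 1, and it decreases as the propensities become more variable, which is the sense in which it measures representativeness rather than the response rate itself.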

15.
We consider surveys with one or more callbacks and use a series of logistic regressions to model the probabilities of nonresponse at first contact and subsequent callbacks. These probabilities are allowed to depend on covariates as well as the categorical variable of interest and so the nonresponse mechanism is nonignorable. Explicit formulae for the score functions and information matrices are given for some important special cases to facilitate implementation of the method of scoring for obtaining maximum likelihood estimates of the model parameters. For estimating finite population quantities, we suggest the imputation and prediction approaches as alternatives to weighting adjustment. Simulation results suggest that the proposed methods work well in reducing the bias due to nonresponse. In our study, the imputation and prediction approaches perform better than weighting adjustment and they continue to perform quite well in simulations involving misspecified response models.

16.
Influential units occur frequently in surveys, especially in business surveys that collect economic variables whose distributions are highly skewed. A unit is said to be influential when its inclusion or exclusion from the sample has an important impact on the sampling error of estimates. We extend the concept of conditional bias attached to a unit and propose a robust version of the double expansion estimator, which depends on a tuning constant. We determine the tuning constant that minimizes the maximum estimated conditional bias. Our results can be naturally extended to the case of unit nonresponse, the set of respondents often being viewed as a second-phase sample. A robust version of calibration estimators, based on auxiliary information available at both phases, is also constructed.

17.
Wilks's theorem is useful for constructing confidence regions. When applying the popular empirical likelihood to data with nonignorable nonresponse, Wilks's phenomenon does not hold. This paper shows that this is caused by the extra estimation of the nuisance parameter in the nonignorable nonresponse propensity. Motivated by this result, we propose an adjusted empirical likelihood for which Wilks's theorem holds. Asymptotic results are presented and supplemented by simulation results for finite sample performance of the point estimators and confidence regions. An analysis of a data set is included for illustration.

18.
Important empirical information on household behavior and finances is obtained from surveys, and these data are used heavily by researchers and central banks, and for policy consulting. However, various interdependent factors that can be controlled only to a limited extent lead to unit and item nonresponse, and missing data on certain items are a frequent source of difficulties in statistical practice. More than ever, it is important to explore techniques for the imputation of large survey data. This paper presents the theoretical underpinnings of a Markov chain Monte Carlo multiple imputation procedure and outlines important technical aspects of the application of MCMC-type algorithms to large socio-economic data sets. In an illustrative application it is found that MCMC algorithms have good convergence properties even on large data sets with complex patterns of missingness, and that the use of a rich set of covariates in the imputation models has a substantial effect on the distributions of key financial variables.

19.
Recent developments in sample survey theory include the following topics: foundational aspects of inference, resampling methods for variance and confidence interval estimation, imputation for nonresponse and analysis of complex survey data. An overview and appraisal of some of these developments are presented.

20.
Many large-scale sample surveys use panel designs under which sampled individuals are interviewed several times before being dropped from the sample. The longitudinal data bases available from such surveys could be used to provide estimates of gross change over time. One problem in using these data to estimate gross change is how to handle the period-to-period nonresponse. This nonresponse is typically nonrandom and, furthermore, may be nonignorable in that it cannot be accounted for by other observed quantities in the data. Under the models proposed in this article, which are appropriate for the analysis of categorical data, the probability of nonresponse may be taken to be a function of the missing variable of interest. The proposed models are fit using maximum likelihood estimation. As an example, the method is applied to the problem of estimating gross flows in labor-force participation using data from the Current Population Survey and the Canadian Labour Force Survey.
