1.

Asymptotic theory and inference of predictive mean matching imputation using a superpopulation model framework

Shu Yang Jae Kwang Kim 《Scandinavian Journal of Statistics》2020,47(3):839-861

Predictive mean matching imputation is popular for handling item nonresponse in survey sampling. In this article, we study the asymptotic properties of the predictive mean matching estimator for finite-population inference using a superpopulation model framework. We also clarify conditions for its robustness. For variance estimation, the conventional bootstrap inference is invalid for matching estimators with a fixed number of matches due to the nonsmooth nature of the matching estimator. We propose a new replication variance estimator, which is asymptotically valid. The key strategy is to construct replicates directly based on the linear terms of the martingale representation for the matching estimator, instead of individual records of variables. Simulation studies confirm that the proposed method provides valid inference.

2.

Semiparametric predictive mean matching (total citations: 1; self-citations: 0; citations by others: 1)

Marco Di Zio Ugo Guarnera 《AStA Advances in Statistical Analysis》2009,93(2):175-186

Predictive mean matching is an imputation method that combines parametric and nonparametric techniques. It imputes missing values by means of the Nearest Neighbor Donor with distance based on the expected values of the missing variables conditional on the observed covariates, instead of computing the distance directly on the values of the covariates. In ordinary predictive mean matching the expected values are computed through a linear regression model. In this paper a generalization of the original predictive mean matching is studied. Here the expected values used for computing the distance are estimated through an approach based on Gaussian mixture models. This approach includes as a special case the original predictive mean matching but allows one to deal also with nonlinear relationships among the variables. In order to assess its performance, an empirical evaluation based on simulations is carried out.
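As a rough illustration of the ordinary (linear regression) variant described in this abstract, the sketch below fits a one-covariate least-squares line on the observed cases, predicts the mean for every unit, and fills each missing value with the observed value of the donor whose predicted mean is closest. It is a minimal sketch under stated assumptions, not the authors' method: the function names `fit_simple_ols` and `pmm_impute`, the single covariate, the single nearest donor and the toy data are all hypothetical choices made for illustration.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single covariate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(x, y):
    """Return a copy of y in which None entries are filled by
    nearest-donor matching on the predicted means (assumed setup:
    one covariate, one donor per missing value)."""
    observed = [i for i, yi in enumerate(y) if yi is not None]
    intercept, slope = fit_simple_ols(
        [x[i] for i in observed], [y[i] for i in observed]
    )
    predicted = [intercept + slope * xi for xi in x]
    imputed = list(y)
    for j, yj in enumerate(y):
        if yj is None:
            # Distance is measured on predicted means, not on x itself.
            donor = min(observed, key=lambda i: abs(predicted[i] - predicted[j]))
            imputed[j] = y[donor]
    return imputed


# Toy data (hypothetical): the third unit has a missing response.
x = [1.0, 2.0, 2.8, 4.0, 5.0]
y = [1.1, 1.9, None, 4.2, 5.0]
imputed = pmm_impute(x, y)
print(imputed)
```

Because the imputed value is always an observed response, the method cannot produce values outside the range of the data, which is one reason it is often described as combining parametric and nonparametric ideas.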

3.

Multiple Imputation of Predictor Variables Using Generalized Additive Models

Roel de Jong Stef van Buuren 《Communications in Statistics - Simulation and Computation》2016,45(3):968-985

The sensitivity of multiple imputation methods to deviations from their distributional assumptions is investigated using simulations, where the parameters of scientific interest are the coefficients of a linear regression model, and values in predictor variables are missing at random. The performance of a newly proposed imputation method based on generalized additive models for location, scale, and shape (GAMLSS) is investigated. Although imputation methods based on predictive mean matching are virtually unbiased, they suffer from mild to moderate under-coverage, even in the experiment where all variables are jointly normally distributed. The GAMLSS method features better coverage than currently available methods.

4.

Imputation techniques for incomplete data in quadratic discriminant analysis

《Journal of Statistical Computation and Simulation》2012,82(6):863-877

We have compared the efficacy of five imputation algorithms readily available in SAS for the quadratic discriminant function. Here, we have generated several different parametric-configuration training data with missing data, including monotone missing-at-random observations, and used a Monte Carlo simulation to examine the expected probabilities of misclassification for the two-class quadratic statistical discrimination problem under five different imputation methods. Specifically, we have compared the efficacy of the complete observation-only method and the mean substitution, regression, predictive mean matching, propensity score, and Markov Chain Monte Carlo (MCMC) imputation methods. We found that the MCMC and propensity score multiple imputation approaches are, in general, superior to the other imputation methods for the configurations and training-sample sizes we considered.

5.

On robust causality nonresponse testing in duration studies under the Cox model

Tadeusz Bednarski 《Statistical Papers》2014,55(1):221-231

High survey nonresponse in unemployment duration studies may have a strong effect on inference if the so-called causal mechanism is present. A robust method of testing the causal nonresponse is proposed for data sets where survey information can be combined with complete administrative records. It is assumed that the population distribution approximately follows the Cox regression model. Formal justification of the method and a comparative simulation study are included.

6.

Likelihood-based confidence sets for partially identified parameters

Zhiwei Zhang 《Journal of statistical planning and inference》2009

There has been growing interest in partial identification of probability distributions and parameters. This paper considers statistical inference on parameters that are partially identified because data are incompletely observed, due to nonresponse or censoring, for instance. A method based on likelihood ratios is proposed for constructing confidence sets for partially identified parameters. The method can be used to estimate a proportion or a mean in the presence of missing data, without assuming missing-at-random or modeling the missing-data mechanism. It can also be used to estimate a survival probability with censored data without assuming independent censoring or modeling the censoring mechanism. A version of the verification bias problem is studied as well.

7.

ESTIMATION PROCEDURES FOR CATEGORICAL SURVEY DATA WITH NONIGNORABLE NONRESPONSE

《Communications in Statistics - Theory and Methods》2013,42(4):643-663

We consider surveys with one or more callbacks and use a series of logistic regressions to model the probabilities of nonresponse at first contact and subsequent callbacks. These probabilities are allowed to depend on covariates as well as the categorical variable of interest and so the nonresponse mechanism is nonignorable. Explicit formulae for the score functions and information matrices are given for some important special cases to facilitate implementation of the method of scoring for obtaining maximum likelihood estimates of the model parameters. For estimating finite population quantities, we suggest the imputation and prediction approaches as alternatives to weighting adjustment. Simulation results suggest that the proposed methods work well in reducing the bias due to nonresponse. In our study, the imputation and prediction approaches perform better than weighting adjustment and they continue to perform quite well in simulations involving misspecified response models.

8.

Modeling Nonignorable Nonresponse in Categorical Panel Data With an Example in Estimating Gross Labor-Force Flows

Elizabeth A. Stasny 《Journal of Business & Economic Statistics》2013,31(2):207-219

Many large-scale sample surveys use panel designs under which sampled individuals are interviewed several times before being dropped from the sample. The longitudinal data bases available from such surveys could be used to provide estimates of gross change over time. One problem in using these data to estimate gross change is how to handle the period-to-period nonresponse. This nonresponse is typically nonrandom and, furthermore, may be nonignorable in that it cannot be accounted for by other observed quantities in the data. Under the models proposed in this article, which are appropriate for the analysis of categorical data, the probability of nonresponse may be taken to be a function of the missing variable of interest. The proposed models are fit using maximum likelihood estimation. As an example, the method is applied to the problem of estimating gross flows in labor-force participation using data from the Current Population Survey and the Canadian Labour Force Survey.

9.

Robust inference for estimating equations with nonignorably missing data based on SIR algorithm

Yunquan Song Yanji Zhu Xiuli Wang Lu Lin 《Journal of Statistical Computation and Simulation》2019,89(17):3196-3212

Nonresponse is a very common phenomenon in survey sampling. Nonignorable nonresponse, that is, a response mechanism that depends on the values of the variable having nonresponse, is the most difficult type of nonresponse to handle. This article develops a robust estimation approach to estimating equations (EEs) by incorporating the modelling of nonignorably missing data, the generalized method of moments (GMM) and the imputation of EEs via the observed data rather than the imputed missing values when some responses are subject to nonignorable missingness. Based on a particular semiparametric logistic model for nonignorable missing response, this paper proposes modified EEs to calculate the conditional expectation under nonignorably missing data. We can apply the GMM to infer the parameters. The advantage of our method is that it replaces nonparametric kernel smoothing with a parametric sampling importance resampling (SIR) procedure, avoiding the problems that kernel smoothing has with high-dimensional covariates. Simulations show the proposed method to be more robust than some current approaches.

10.

Using Auxiliary Data for Binomial Parameter Estimation with Nonignorable Nonresponse

Xueli Wang Hua Chen Zhi Geng Xiaohua Zhou 《Communications in Statistics - Theory and Methods》2013,42(19):3468-3478

Nonignorable nonresponse is a nonresponse mechanism that depends on the values of the variable having nonresponse. When observed data from a binomial distribution suffer from missing values due to a nonignorable nonresponse mechanism, the binomial distribution parameters become unidentifiable without any other auxiliary information or assumption. To address the problem of nonidentifiability, existing methods are mostly based on the log-linear regression model. In this article, we focus on the model when the nonresponse is nonignorable and we consider using auxiliary data to improve identifiability; furthermore, we derive the maximum likelihood estimator (MLE) for the binomial proportion and its associated variance. We present results for an analysis of real-life data from the SARS study in China. Finally, the simulation study shows that the proposed method gives promising results.

11.

Nonresponse Assessment of a Consumer Price Index

H. M. P. Kersten 《Journal of Business & Economic Statistics》2013,31(4):336-343

A household budget survey often suffers from a high nonresponse rate and a selective response. The bias that may be introduced in the estimation of budget shares because of this nonresponse can affect the estimate of a consumer price index, which is a weighted sum of partial price index numbers (weighted with the estimated budget shares). The bias is especially important when related to the standard error of the estimate. Because of the impossibility of subsampling nonrespondents to the budget survey, no exact information on the bias can be obtained. To evaluate the nonresponse bias, bounds for this bias are calculated using linear programming methods for several assumptions. The impact on a price index of a high nonresponse rate among people with a high income can also be assessed by using the elasticity with respect to total expenditure. Attention is also given to the possible nonresponse bias in a time series of price index numbers. The possible nonresponse bias is much larger than the standard error of the estimate.

12.

Response probability estimation

《Journal of statistical planning and inference》1997,59(1):111-126

This paper extends the ideas in Giommi (Proc. 45th Session of the Internat. Statistical Institute, Vol. 2 (1985) 577–578; Techniques d'enquête 13(2) (1987) 137–144) and in Särndal and Swenson (Bull. Int. Statist. Inst. 15(2) (1985) 1–16; Int. Statist. Rev. 55 (1987) 279–294). Given the parallel between a ‘three-phase sampling’ and a ‘sampling with subsequent unit and item nonresponse’, we apply results from three-phase sampling theory to the nonresponse situation. To handle the practical problem of unknown distributions at the second and the third phases of selection (the response mechanisms) in the nonresponse case, we use two approaches to response probability estimation: the response homogeneity groups (RHG) model (Särndal and Swenson, 1985, 1987) and nonparametric estimation (Giommi, 1985, 1987). To motivate the three-phase selection, imputation procedures for item nonresponse are used with the RHG model for unit nonresponse. By means of a Monte Carlo study, we find that the regression-type estimators are the most precise of those studied under the two approaches to response probability estimation: they have lower bias, mean square error and variance, variance estimators close to the true variance, and achieved coverage rates closer to the nominal levels. The simulation study shows how poor the variance estimators are under the single imputation approach currently used to handle the problem of missing values.

13.

Assessing the impact of initial nonresponse and attrition in the analysis of unemployment duration with panel surveys

Marjo Pyy-Martikainen Ulrich Rendtel 《AStA Advances in Statistical Analysis》2008,92(3):297-318

We show how register data combined at person-level with survey data can be used to conduct a novel type of nonresponse analysis in a panel survey. The availability of register data provides a unique opportunity to directly test the type of the missingness mechanism as well as estimate the size of bias due to initial nonresponse and attrition. We are also able to study in-depth the determinants of initial nonresponse and attrition. We use the Finnish subset of the European Community Household Panel (FI ECHP) data combined with register panel data and unemployment spells as outcome variables of interest. Our results show that initial nonresponse and attrition are clearly different processes driven by different background variables. Both the initial nonresponse and attrition mechanisms are nonignorable with respect to analysis of unemployment spells. Finally, our results suggest that initial nonresponse may play a role at least as important as attrition in causing bias. This result challenges the common view of attrition being the main threat to the value of panel data.

14.

A fresh approach for intercession of nonresponse in multivariate longitudinal designs

Kumari Priyanka Richa Mittal 《Communications in Statistics - Theory and Methods》2017,46(18):9303-9323

The occurrence of nonresponse is very much plebeian in surveys, which troubles the analysis, and hence, an inappropriate inference is left out. To counterbalance the sour effects of the incompleteness, fresh imputation techniques have been proposed with the aid of multi-auxiliary variates for the estimation of population mean on successive waves. Properties of the proposed estimators have been elaborated, and they have been compared with the work of Priyanka et al. (2015). Detailed simulation study is carried out to substantiate the empirical and theoretical results. Several possible cases have been addressed in which nonresponse can occur.

15.

Nonresponse weighting adjustment using estimated response probability

Jae Kwang Kim Jay J. Kim 《Revue canadienne de statistique》2007,35(4):501-514

To reduce nonresponse bias in sample surveys, a method of nonresponse weighting adjustment is often used which consists of multiplying the sampling weight of the respondent by the inverse of the estimated response probability. The authors examine the asymptotic properties of this estimator. They prove that it is generally more efficient than an estimator which uses the true response probability, provided that the parameters which govern this probability are estimated by maximum likelihood. The authors discuss variance estimation methods that account for the effect of using the estimated response probability; they compare their performances in a small simulation study. They also discuss extensions to the regression estimator.
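The weighting adjustment described in this abstract can be sketched in a few lines. The example below assumes a simple response model that is not taken from the paper: units fall into groups, every unit in a group shares one response probability, and that probability is estimated by the unweighted group response rate (the maximum likelihood estimate under this assumed model). The function name `nwa_mean`, the group labels and the toy data are hypothetical, and each group is assumed to contain at least one respondent.

```python
from collections import defaultdict


def nwa_mean(weights, groups, responded, y):
    """Weighted mean of y over respondents, with each sampling weight
    multiplied by the inverse of the estimated response probability."""
    n_sampled = defaultdict(int)
    n_responded = defaultdict(int)
    for g, r in zip(groups, responded):
        n_sampled[g] += 1
        n_responded[g] += int(r)
    # Estimated response probability per group (assumed > 0 in every group).
    p_hat = {g: n_responded[g] / n_sampled[g] for g in n_sampled}

    numerator = 0.0
    denominator = 0.0
    for w, g, r, yi in zip(weights, groups, responded, y):
        if r:
            adjusted = w / p_hat[g]  # sampling weight times 1 / p_hat
            numerator += adjusted * yi
            denominator += adjusted
    return numerator / denominator


# Toy sample (hypothetical): group "a" responds half the time, group "b" always.
weights = [10.0] * 8
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
responded = [True, True, False, False, True, True, True, True]
y = [1.0, 3.0, None, None, 10.0, 10.0, 10.0, 10.0]

adjusted_mean = nwa_mean(weights, groups, responded, y)
print(adjusted_mean)
```

In this toy sample the unadjusted respondent mean over-represents group "b"; the adjustment doubles the weights of the two respondents in group "a" to compensate for the missing half of that group.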

16.

Estimation of finite population kurtosis under two-phase sampling for nonresponse

Wojciech Gamrot 《Statistical Papers》2012,53(4):887-894

In this paper an estimator of finite population kurtosis computed under two-phase sampling for nonresponse is proposed. The formulas characterizing its asymptotic properties are derived using the Taylor linearization technique for the general situation of arbitrary sampling designs in both phases and stochastic nonresponse represented by an arbitrary response distribution. An important special case of simple random sampling without replacement and deterministic nonresponse is also considered.

17.

A note on nonexistence of posterior moments

Dongchu Sun Paul L. Speckman 《Revue canadienne de statistique》2005,33(4):591-601

Bayesian analyses often take for granted the assumption that the posterior distribution has at least a first moment. They often include computed or estimated posterior means. In this note, the authors show an example of a Weibull distribution parameter where the theoretical posterior mean fails to exist for commonly used proper semi-conjugate priors. They also show that posterior moments can fail to exist with commonly used noninformative priors including Jeffreys, reference and matching priors, despite the fact that the posteriors are proper. Moreover, within a broad class of priors, the predictive distribution also has no mean. The authors illustrate the problem with a simulated example. Their results demonstrate that the unwitting use of estimated posterior means may yield unjustified conclusions.

18.

Projecting From Advance Data Using Propensity Modeling: An Application to Income and Tax Statistics

John L. Czajka Sharon M. Hirabayashi Roderick J. A. Little Donald B. Rubin 《Journal of Business & Economic Statistics》2013,31(2):117-131

This article proposes and evaluates two new methods of reweighting preliminary data to obtain estimates more closely approximating those derived from the final data set. In our motivating example, the preliminary data are an early sample of tax returns, and the final data set is the sample after all tax returns have been processed. The new methods estimate a predicted propensity for late filing for each return in the advance sample and then poststratify based on these propensity scores. Using advance and complete sample data for 1982, we demonstrate that the new methods produce advance estimates generally much closer to the final estimates than those derived from the current advance estimation techniques. The results demonstrate the value of propensity modeling, a general-purpose methodology that can be applied to a wide range of problems, including adjustment for unit nonresponse and frame undercoverage as well as statistical matching.

19.

Multiobjective Stochastic Multivariate Stratified Sampling in Presence of Nonresponse

Sanam Haseen Abdul Bari 《Communications in Statistics - Simulation and Computation》2016,45(8):2810-2826

The case of nonresponse in multivariate stratified sampling surveys was first introduced by Hansen and Hurwitz in 1946, considering the sampling variances and costs to be deterministic. However, in real-life situations sampling variances and costs are often random (stochastic) and have probability distributions. In this article, we have formulated multivariate stratified sampling in the presence of nonresponse with random sampling variances and costs as a multiobjective stochastic programming problem. Here, the sampling variances and costs are considered random and the problem is converted into a deterministic NLPP by using chance constraints and the modified E-model. A solution procedure using three different approaches, viz. goal programming, fuzzy programming, and the D₁ distance method, is adopted to obtain the compromise allocation for the formulated problem. An empirical study has also been provided to illustrate the computational details.

20.

Balanced k-nearest neighbour imputation

Caren Hasler Yves Tillé 《Statistics》2016,50(6):1310-1331

Random imputation is an interesting class of imputation methods to handle item nonresponse because it tends to preserve the distribution of the imputed variable. However, such methods amplify the total variance of the estimators because values are imputed at random. This increase in variance is called imputation variance. In this paper, we propose a new random hot-deck imputation method that is based on the k-nearest neighbour methodology. It replaces the missing value of a unit with the observed value of a similar unit. Calibration and balanced sampling are applied to minimize the imputation variance. Moreover, our proposed method provides triple protection against nonresponse bias. This means that if at least one out of three specified models holds, then the resulting total estimator is unbiased. Finally, our approach allows the user to perform consistency edits and to impute simultaneously.
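To make the starting point of this abstract concrete, the sketch below shows a plain random k-nearest-neighbour hot-deck: each missing value is replaced by the observed value of a donor drawn at random from the k respondents closest in an auxiliary variable. It deliberately omits the calibration and balanced sampling steps that the paper applies to reduce imputation variance, so it illustrates only the unbalanced baseline. The function name `knn_hot_deck`, the single auxiliary variable `x` and the toy data are hypothetical.

```python
import random


def knn_hot_deck(x, y, k, rng):
    """Return a copy of y in which each None is replaced by the observed
    value of a donor chosen at random among the k nearest respondents
    (nearness measured on the auxiliary variable x)."""
    donors = [i for i, yi in enumerate(y) if yi is not None]
    imputed = list(y)
    for j, yj in enumerate(y):
        if yj is None:
            nearest = sorted(donors, key=lambda i: abs(x[i] - x[j]))[:k]
            imputed[j] = y[rng.choice(nearest)]
    return imputed


# Toy data (hypothetical): the unit with x = 2.4 is a nonrespondent.
x = [1.0, 2.0, 2.4, 3.0, 9.0]
y = [10.0, 20.0, None, 30.0, 90.0]
rng = random.Random(0)  # fixed seed so the draw is reproducible
imputed = knn_hot_deck(x, y, k=2, rng=rng)
print(imputed)
```

With k = 2 the two nearest respondents here are the units with x = 2.0 and x = 3.0, so the imputed value is either 20.0 or 30.0 depending on the random draw; that draw is the source of the imputation variance the abstract refers to.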