Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
In many applications, a finite population contains a large proportion of zero values that make the population distribution severely skewed. An unequal‐probability sampling plan compounds the problem, and as a result the normal approximation to the distribution of various estimators has poor precision. The central‐limit‐theorem‐based confidence intervals for the population mean are hence unsatisfactory. Complex designs also make it hard to pin down useful likelihood functions, hence a direct likelihood approach is not an option. In this paper, we propose a pseudo‐likelihood approach. The proposed pseudo‐log‐likelihood function is an unbiased estimator of the log‐likelihood function when the entire population is sampled. Simulations have been carried out. When the inclusion probabilities are related to the unit values, the pseudo‐likelihood intervals are superior to existing methods in terms of the coverage probability, the balance of non‐coverage rates on the lower and upper sides, and the interval length. An application with a data set from the Canadian Labour Force Survey‐2000 also shows that the pseudo‐likelihood method performs more appropriately than other methods. The Canadian Journal of Statistics 38: 582–597; 2010 © 2010 Statistical Society of Canada

2.
Misclassifications in binary responses have long been a common problem in medical and health surveys. One way to handle misclassifications in clustered or longitudinal data is to incorporate the misclassification model through the generalized estimating equation (GEE) approach. However, existing methods are developed under a non-survey setting and cannot be used directly for complex survey data. We propose a pseudo-GEE method for the analysis of binary survey responses with misclassifications. We focus on cluster sampling and develop analysis strategies for analyzing binary survey responses with different forms of additional information for the misclassification process. The proposed methodology has several attractive features, including simultaneous inferences for both the response model and the association parameters. Finite sample performance of the proposed estimators is evaluated through simulation studies and an application using a real dataset from the Canadian Longitudinal Study on Aging.

3.
Imputation is often used in surveys to treat item nonresponse. It is well known that treating the imputed values as observed values may lead to substantial underestimation of the variance of the point estimators. To overcome the problem, a number of variance estimation methods have been proposed in the literature, including resampling methods such as the jackknife and the bootstrap. In this paper, we consider the problem of doubly robust inference in the presence of imputed survey data. In the doubly robust literature, point estimation has been the main focus. In this paper, using the reverse framework for variance estimation, we derive doubly robust linearization variance estimators in the case of deterministic and random regression imputation within imputation classes. Also, we study the properties of several jackknife variance estimators under both negligible and nonnegligible sampling fractions. A limited simulation study investigates the performance of various variance estimators in terms of relative bias and relative stability. Finally, the asymptotic normality of imputed estimators is established for stratified multistage designs under both deterministic and random regression imputation. The Canadian Journal of Statistics 40: 259–281; 2012 © 2012 Statistical Society of Canada

4.
Generalised variance function (GVF) models are data analysis techniques often used in large‐scale sample surveys to approximate the design variance of point estimators for population means and proportions. Some potential advantages of the GVF approach include operational simplicity, more stable sampling error estimates and providing a convenient method of summarising results when a high number of survey variables is considered. In this paper, several parametric and nonparametric methods for GVF estimation with binary variables are proposed and compared. The behaviour of these estimators is analysed under heteroscedasticity and in the presence of outliers and influential observations. An empirical study based on the annual survey of living conditions in Galicia (a region in the northwest of Spain) illustrates the behaviour of the proposed estimators.

5.
Using survey weights, You & Rao [You and Rao, The Canadian Journal of Statistics 2002; 30, 431–439] proposed a pseudo‐empirical best linear unbiased prediction (pseudo‐EBLUP) estimator of a small area mean under a nested error linear regression model. This estimator borrows strength across areas through a linking model, and makes use of survey weights to ensure design consistency and preserve the benchmarking property, in the sense that the estimators add up to a reliable direct estimator of the mean of a large area covering the small areas. In this article, a second‐order approximation to the mean squared error (MSE) of the pseudo‐EBLUP estimator of a small area mean is derived. Using this approximation, an estimator of MSE that is nearly unbiased is derived; the MSE estimator of You & Rao [You and Rao, The Canadian Journal of Statistics 2002; 30, 431–439] ignored cross‐product terms in the MSE and hence it is biased. Empirical results on the performance of the proposed MSE estimator are also presented. The Canadian Journal of Statistics 38: 598–608; 2010 © 2010 Statistical Society of Canada

6.
Jun Shao. Statistics, 2013, 47(3-4): 203-237
This article reviews the applications of three resampling methods, the jackknife, the balanced repeated replication, and the bootstrap, in sample surveys. The sampling design under consideration is a stratified multistage sampling design. We discuss the implementation of the resampling methods; for example, the construction of balanced repeated replications and approximated balanced repeated replication estimators; four modified bootstrap algorithms to generate bootstrap samples; and three different ways of applying the resampling methods in the presence of imputed missing values. Asymptotic properties of the resampling estimators are discussed for two types of important survey estimators, functions of weighted averages and sample quantiles.

7.
Bayesian hierarchical formulations are utilized by the U.S. Bureau of Labor Statistics (BLS) with respondent‐level data for missing item imputation because these formulations are readily parameterized to capture correlation structures. BLS collects survey data under informative sampling designs that assign probabilities of inclusion to be correlated with the response on which sampling‐weighted pseudo posterior distributions are estimated for asymptotically unbiased inference about population model parameters. Computation is expensive and does not support BLS production schedules. We propose a new method to scale the computation that divides the data into smaller subsets, estimates a sampling‐weighted pseudo posterior distribution, in parallel, for every subset and combines the pseudo posterior parameter samples from all the subsets through their mean in the Wasserstein space of order 2. We construct conditions on a class of sampling designs where posterior consistency of the proposed method is achieved. We demonstrate on both synthetic data and in application to the Current Employment Statistics survey that our method produces results of similar accuracy as the usual approach while offering substantially faster computation.

8.
In a multilevel model for complex survey data, the weight‐inflated estimators of variance components can be biased. We propose a resampling method to correct this bias. The performance of the bias corrected estimators is studied through simulations using populations generated from a simple random effects model. The simulations show that, without lowering the precision, the proposed procedure can reduce the bias of the estimators, especially for designs that are both informative and have small cluster sizes. Application of these resampling procedures to data from an artificial workplace survey provides further evidence for the empirical value of this method. The Canadian Journal of Statistics 40: 150–171; 2012 © 2012 Statistical Society of Canada

9.
The evaluation of new processor designs is an important issue in electrical and computer engineering. Architects use simulations to evaluate designs and to understand trade‐offs and interactions among design parameters. However, due to the lengthy simulation time and limited resources, it is often practically impossible to simulate a full factorial design space. Effective sampling methods and predictive models are required. In this paper, the authors propose an automated performance predictive approach which employs an adaptive sampling scheme that interactively works with the predictive model to select samples for simulation. These samples are then used to build Bayesian additive regression trees, which in turn are used to predict the whole design space. Both real data analysis and simulation studies show that the method is effective in that, though sampling at very few design points, it generates highly accurate predictions on the unsampled points. Furthermore, the proposed model provides quantitative interpretation tools with which investigators can efficiently tune design parameters in order to improve processor performance. The Canadian Journal of Statistics 38: 136–152; 2010 © 2010 Statistical Society of Canada

10.
A model‐based predictive estimator is proposed for the population proportions of a polychotomous response variable, based on a sample from the population and on auxiliary variables, whose values are known for the entire population. The responses for the non‐sample units are predicted using a multinomial logit model, which is a parametric function of the auxiliary variables. A bootstrap estimator is proposed for the variance of the predictive estimator, its consistency is proved and its small sample performance is compared with that of an analytical estimator. The proposed predictive estimator is compared with other available estimators, including model‐assisted ones, both in a simulation study involving different sampling designs and model mis‐specification, and using real data from an opinion survey. The results indicate that the prediction approach appears to use auxiliary information more efficiently than the model‐assisted approach.

11.
Marginal imputation, which consists of imputing items separately, generally leads to biased estimators of bivariate parameters such as finite population coefficients of correlation. To overcome this problem, two main approaches have been considered in the literature. The first consists of using customary imputation methods such as random hot‐deck imputation and adjusting for the bias at the estimation stage. This approach was studied in Skinner & Rao (2002). In this paper, we extend the results of Skinner & Rao (2002) to the case of arbitrary sampling designs and three variants of random hot‐deck imputation. The second approach consists of using an imputation method that preserves the relationship between variables. Shao & Wang (2002) proposed a joint random regression imputation procedure that succeeds in preserving the relationships between two study variables. One drawback of the Shao–Wang procedure is that it suffers from an additional variability (called the imputation variance) due to the random selection of residuals, resulting in potentially inefficient estimators. Following Chauvet, Deville, & Haziza (2011), we propose a fully efficient version of the Shao–Wang procedure that preserves the relationship between two study variables, while virtually eliminating the imputation variance. Results of a simulation study support our findings. An application using data from the Workplace and Employees Survey is also presented. The Canadian Journal of Statistics 40: 124–149; 2012 © 2011 Statistical Society of Canada

12.
The paper investigates non-negative quadratic unbiased (NnQU) estimators of positive semi-definite quadratic forms, for use during the survey sampling of finite population values. It examines several different NnQU estimators of the variance of estimators of population total, under various sampling designs. It identifies an optimal quadratic unbiased estimator of the variance of the Horvitz-Thompson estimator of population total.
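
For readers unfamiliar with the Horvitz-Thompson estimator mentioned in this abstract, the following is a minimal sketch of the textbook estimator of a population total and its standard unbiased variance estimator. It is not the optimal estimator identified in the paper. The function names and the argument layout (`pi` for first-order and `joint_pi` for second-order inclusion probabilities) are assumptions made for illustration.

```python
def horvitz_thompson_total(values, pi):
    """Textbook Horvitz-Thompson estimate of a population total.

    values: observed y-values of the sampled units
    pi:     first-order inclusion probabilities of those units
    """
    if len(values) != len(pi):
        raise ValueError("values and pi must have the same length")
    if any(not 0 < p <= 1 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each sampled unit is weighted by the inverse of its inclusion probability.
    return sum(y / p for y, p in zip(values, pi))


def horvitz_thompson_variance(values, pi, joint_pi):
    """Standard unbiased variance estimator for the Horvitz-Thompson total.

    joint_pi[i][j] is the probability that units i and j are both sampled
    (hypothetical layout; the diagonal is ignored and pi[i] is used instead).
    The result can be negative for some designs, which is the kind of
    defect that non-negative estimators are meant to avoid.
    """
    n = len(values)
    estimate = 0.0
    for i in range(n):
        for j in range(n):
            p_ij = pi[i] if i == j else joint_pi[i][j]
            weight = (p_ij - pi[i] * pi[j]) / p_ij
            estimate += weight * (values[i] / pi[i]) * (values[j] / pi[j])
    return estimate
```

As a usage example, under simple random sampling without replacement of n = 2 units from N = 4, every unit has inclusion probability 1/2 and every pair has joint inclusion probability 1/6, so `horvitz_thompson_total([3, 5], [0.5, 0.5])` expands the sample sum by the factor N/n = 2.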

13.
Biased sampling occurs often in observational studies. With one biased sample, the problem of nonparametrically estimating both a target density function and a selection bias function is unidentifiable. This paper studies the nonparametric estimation problem when there are two biased samples that have some overlapping observations (i.e. recaptures) from a finite population. Since an intelligent subject sampled previously may experience a memory effect if sampled again, two general 2-stage models that incorporate both a selection bias and a possible memory effect are proposed. Nonparametric estimators of the target density, selection bias, and memory functions, as well as the population size are developed. Asymptotic properties of these estimators are studied and confidence bands for the selection function and memory function are provided. Our procedures are compared with those ignoring the memory effect or the selection bias in finite sample situations. A nonparametric model selection procedure is also given for choosing a model from the two 2-stage models and a mixture of these two models. Our procedures work well with or without a memory effect, and with or without a selection bias. The paper concludes with an application to a real survey data set.

14.
MODEL-ASSISTED HIGHER-ORDER CALIBRATION OF ESTIMATORS OF VARIANCE   Total citations: 1, self-citations: 0, citations by others: 1
In survey sampling, interest often centres on inference for the population total using information about an auxiliary variable. The variance of the estimator used plays a key role in such inference. This study develops a new set of higher‐order constraints for the calibration of estimators of variance for various estimators of the population total. The proposed strategy requires an appropriate model for describing the relationship between the response and auxiliary variable, and the variance of the auxiliary variable. It is therefore referred to as a model‐assisted approach. Several new estimators of variance, including the higher‐order calibration estimators of the variance of the ratio and regression estimators suggested by Singh, Horn & Yu and Sitter & Wu, are special cases of the proposed technique. The paper presents and discusses the results of an empirical study to compare the performance of the proposed estimators and existing counterparts.

15.
Matched case–control designs are commonly used in epidemiological studies for estimating the effect of exposure variables on the risk of a disease by controlling the effect of confounding variables. Due to the retrospective nature of the study, information on a covariate could be missing for some subjects. A straightforward application of the conditional logistic likelihood for analyzing matched case–control data with the partially missing covariate may yield inefficient estimators of the parameters. A robust method has been proposed to handle this problem using an estimated conditional score approach when the missingness mechanism does not depend on the disease status. Within the conditional logistic likelihood framework, an empirical procedure is used to estimate the odds of the disease for the subjects with missing covariate values. The asymptotic distribution and the asymptotic variance of the estimator are derived when the matching variables and the completely observed covariates are categorical. The finite sample performance of the proposed estimator is assessed through a simulation study. Finally, the proposed method has been applied to analyze two matched case–control studies. The Canadian Journal of Statistics 38: 680–697; 2010 © 2010 Statistical Society of Canada

16.
The stability of a slightly modified version of the usual jackknife variance estimator is evaluated exactly in small samples under a suitable linear regression model and compared with that of two different linearization variance estimators. Depending on the degree of heteroscedasticity of the error variance in the model, the stability of the jackknife variance estimator is found to be somewhat comparable to that of one or the other of the linearization variance estimators under conditions especially favorable to ratio estimation (i.e., regression approximately through the origin with a relatively small coefficient of variation in the x population). When these conditions do not hold, however, the jackknife variance estimator is found to be less stable than either of the linearization variance estimators.

17.
The present investigation deals with the problem of estimation of population mean in two-phase sampling. In the presence of two auxiliary variables, some classes of estimators have been proposed through a predictive approach. Properties of the proposed classes of estimators have been studied, and the unbiased versions of these estimators along with their approximate variance expressions are obtained under a simple random sampling without replacement scheme. The respective optimum strategies of the proposed estimators are discussed, and their empirical and graphical comparisons with some contemporary estimators of population mean have been made. Suitable recommendations to the survey practitioner are given.

18.
The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.
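
As background for this entry, here is a minimal sketch of the classical delete-one jackknife variance estimator for a generic point estimator, assuming an equal-probability sample. It is not the unequal-probability estimator proposed in the paper, which the abstract does not specify. The name `jackknife_variance` and its signature are hypothetical.

```python
from statistics import mean


def jackknife_variance(sample, estimator):
    """Classical delete-one jackknife variance estimate.

    sample:    sequence of observations (equal-probability sample assumed)
    estimator: function mapping a list of observations to a number
    """
    data = list(sample)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the estimator n times, leaving out one observation each time.
    replicates = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(replicates) / n
    # Scale the spread of the replicates by (n - 1) / n.
    return (n - 1) / n * sum((r - center) ** 2 for r in replicates)
```

For the sample mean this reduces to the familiar s²/n, so `jackknife_variance([1, 2, 3, 4], mean)` agrees with the usual variance estimate of the mean.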

19.
In this paper, a new estimator for estimating the proportion of a potentially sensitive attribute in survey sampling has been introduced. The proposed estimator makes use of higher order moments of the scrambling variable at the estimation stage. The proposed estimator has been found to be more efficient than the estimator due to Kuk [1990. Asking sensitive questions indirectly. Biometrika 77(2), 436–438] and Franklin [1989. A comparison of estimators for randomized response sampling with continuous distributions from a dichotomous population. Comm. Statist. Theory Methods 18, 489–505] type estimators in randomized response sampling. Recently, Guerriero and Sandri [2007. A note on the comparison of some randomized response procedures. J. Statist. Plann. Inference 137, 2184–2190] have shown that the family of randomized response models proposed by Kuk [1990. Asking sensitive questions indirectly. Biometrika 77(2), 436–438] is better than the Simmons’ family in terms of efficiency and protection.

20.
This paper develops two sampling designs to create artificially stratified samples. These designs use a small set of experimental units to determine their relative ranks without measurement. In each set, the units are ranked by all available observers (rankers), with ties whenever the units cannot be ranked with high confidence. The rankings from all the observers are then combined in a meaningful way to create a single weight measure. This weight measure is used to create judgment strata in both designs. The first design constructs the strata through judgment post‐stratification after the data has been collected. The second design creates the strata before any measurements are made on the experimental units. The paper constructs estimators and confidence intervals, and develops testing procedures for the mean and median of the underlying distribution based on these sampling designs. We show that the proposed sampling designs provide a substantial improvement over their competitor designs in the literature. The Canadian Journal of Statistics 41: 304–324; 2013 © 2013 Statistical Society of Canada
