Similar literature
1.
A systematic procedure for the derivation of linearized variables for the estimation of sampling errors of complex nonlinear statistics involved in the analysis of poverty and income inequality is developed. The linearized variable extends the use of standard variance estimation formulae, developed for linear statistics such as sample aggregates, to nonlinear statistics. The context is that of cross-sectional samples of complex design and reasonably large size, as typically used in population-based surveys. Results of application of the procedure to a wide range of poverty and inequality measures are presented. Standardized software for the purpose has been developed and can be provided to interested users on request. Procedures are provided for the estimation of the design effect and its decomposition into the contribution of unequal sample weights and of other design complexities such as clustering and stratification. The consequence of treating a complex statistic as a simple ratio in estimating its sampling error is also quantified. The second theme of the paper is to compare the linearization approach with an alternative approach based on the concept of replication, namely the Jackknife repeated replication (JRR) method. The basis and application of the JRR method are described, the exposition paralleling that of the linearization method but in somewhat less detail. Based on data from an actual national survey, estimates of standard errors and design effects from the two methods are analysed and compared. The numerical results confirm that the two alternative approaches generally give very similar results, though notable differences can exist for certain statistics. Relative advantages and limitations of the approaches are identified.
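To make the replication idea mentioned in this abstract concrete, here is a minimal sketch of a delete-one-PSU jackknife for a weighted mean under a stratified design. It is not the authors' software: the function names, the data layout (parallel lists of values, weights, stratum labels and PSU labels) and the choice of a weighted mean as the statistic are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_mean(y, w):
    """Weighted mean of y with weights w."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def jrr_variance(y, w, stratum, psu, estimator=weighted_mean):
    """Delete-one-PSU jackknife variance of estimator(y, w).

    y, w, stratum, psu are parallel lists (hypothetical layout).
    Returns (full-sample estimate, jackknife variance estimate).
    """
    theta = estimator(y, w)

    # Collect the distinct PSU labels found in each stratum.
    psus_in = defaultdict(set)
    for h, c in zip(stratum, psu):
        psus_in[h].add(c)

    var = 0.0
    for h, clusters in psus_in.items():
        a_h = len(clusters)
        if a_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        for dropped in clusters:
            # Replicate weights: zero for the dropped PSU, inflated by
            # a_h / (a_h - 1) for the other PSUs in the same stratum,
            # unchanged in all other strata.
            w_rep = [
                0.0 if (s == h and c == dropped)
                else (wi * a_h / (a_h - 1) if s == h else wi)
                for wi, s, c in zip(w, stratum, psu)
            ]
            var += (a_h - 1) / a_h * (estimator(y, w_rep) - theta) ** 2
    return theta, var
```

Passing a different `estimator` function (for example a weighted poverty rate) reuses the same replicate weights, which is the practical appeal of replication methods for nonlinear statistics.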

2.
In this paper we present methods for inference on data selected by a complex sampling design for a class of statistical models for the analysis of ordinal variables. Specifically, assuming that the sampling scheme is not ignorable, we derive for the class of CUB models (Combination of discrete Uniform and shifted Binomial distributions) variance estimates for a complex two-stage stratified sample. Both Taylor linearization and repeated replication variance estimators are presented. We also provide design‐based test diagnostics and goodness‐of‐fit measures. We illustrate by means of real data analysis the differences between survey‐weighted and unweighted point estimates and inferences for CUB model parameters.

3.
Communications in Statistics: Theory and Methods, 2012, 41(16-17): 3278-3300
Under complex survey sampling, in particular when selection probabilities depend on the response variable (informative sampling), the sample and population distributions are different, possibly resulting in selection bias. This article addresses this problem by fitting two statistical models, namely the variance components model (a two-stage model) and the fixed effects model (a single-stage model) for one-way analysis of variance, under a complex survey design involving, for example, two-stage sampling, stratification, and unequal probability of selection. Classical theory underlying the use of the two-stage model involves simple random sampling for each of the two stages. In such cases the model in the sample, after sample selection, is the same as the model for the population before sample selection. When the selection probabilities are related to the values of the response variable, standard estimates of the population model parameters may be severely biased, possibly leading to false inference. The idea behind the approach is to extract the model holding for the sample data as a function of the model in the population and of the first-order inclusion probabilities, and then to fit the sample model using analysis of variance, maximum likelihood, and pseudo maximum likelihood methods of estimation. The main feature of the proposed techniques is related to their behavior in terms of the informativeness parameter. We also show that using the population model while ignoring the informative sampling design yields biased model fitting.

4.
Cross-classified data are often obtained in controlled experimental situations and in epidemiologic studies. As an example of the latter, occupational health studies sometimes require personal exposure measurements on a random sample of workers from one or more job groups, in one or more plant locations, on several different sampling dates. Because the marginal distributions of exposure data from such studies are generally right-skewed and well-approximated as lognormal, researchers in this area often consider the use of ANOVA models after a logarithmic transformation. While it is then of interest to estimate original-scale population parameters (e.g., the overall mean and variance), standard candidates such as maximum likelihood estimators (MLEs) can be unstable and highly biased. Uniformly minimum variance unbiased (UMVU) estimators offer a viable alternative, and are adaptable to sampling schemes that are typical of experimental or epidemiologic studies. In this paper, we provide UMVU estimators for the mean and variance under two random effects ANOVA models for log-transformed data. We illustrate substantial mean squared error gains relative to the MLE when estimating the mean under a one-way classification. We illustrate that the results can readily be extended to encompass a useful class of purely random effects models, provided that the study data are balanced.

5.
Reporting sampling errors of survey estimates is a problem that is commonly addressed when compiling a survey report. Because of the vast number of study variables or population characteristics and of domains of interest in a survey, it is almost impossible to calculate and publish the standard errors for each statistic. A way of overcoming this problem is to estimate the sampling errors indirectly by using generalized variance functions, which define a statistical relationship between the sampling errors and the corresponding estimates. One of the problems with this approach is that the model specification has to be consistent with a roughly constant design effect. If the design effects vary greatly across estimates, as in the case of the Business Surveys, the prediction model is not correctly specified and the least-squares estimation is biased. In this paper, we present an extension of the generalized variance functions that addresses the above problems and could be used in contexts similar to those encountered in Business Surveys. The proposed method has been applied to the Italian Structural Business Statistics Survey.
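As background to this abstract, the following sketch fits the classical generalized variance function relvar = a + b / estimate by ordinary least squares and uses it to predict a standard error. This is the textbook model that assumes a roughly constant design effect, not the extension proposed in the paper; the function names and the input layout are assumptions made for illustration.

```python
def fit_gvf(estimates, relvariances):
    """Least-squares fit of relvar = a + b / estimate; returns (a, b).

    estimates: directly estimated totals (all nonzero).
    relvariances: their directly estimated relative variances.
    """
    z = [1.0 / x for x in estimates]  # regressor is the inverse estimate
    n = len(z)
    z_bar = sum(z) / n
    v_bar = sum(relvariances) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    szv = sum((zi - z_bar) * (vi - v_bar) for zi, vi in zip(z, relvariances))
    b = szv / szz
    a = v_bar - b * z_bar
    return a, b


def predict_se(estimate, a, b):
    """Standard error implied by the fitted GVF for a new estimate."""
    relvar = max(a + b / estimate, 0.0)  # guard against negative predictions
    return estimate * relvar ** 0.5
```

In practice the pair (a, b) would be published once per group of similar statistics, so that readers can approximate a standard error for any estimate in that group.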

6.
Not having a variance estimator is a seriously weak point of a sampling design from a practical perspective. This paper provides unbiased variance estimators for several sampling designs based on inverse sampling, both with and without an adaptive component. It proposes a new design, which is called the general inverse sampling design, that avoids sampling an infeasibly large number of units. The paper provides estimators for this design as well as its adaptive modification. A simple artificial example is used to demonstrate the computations. The adaptive and non‐adaptive designs are compared using simulations based on real data sets. The results indicate that, for appropriate populations, the adaptive version can have a substantial variance reduction compared with the non‐adaptive version. Also, adaptive general inverse sampling with a limitation on the initial sample size has a greater variance reduction than without the limitation.

7.
Several variance estimators using auxiliary information were compared for estimating the variance of the total or ratio under a one-unit-per-stratum sample design. The auxiliary information used consisted of data from the 1977 and 1982 Economic Census. One thousand samples, each with one primary sampling unit per stratum selected with probability proportional to size, were drawn for the Monte Carlo study of characteristics of interest in a content evaluation survey as part of the 1982 Economic Census program. Six variance estimators were applied to each sample and their bias and mean square errors were evaluated. The results are strikingly different between the variance estimators of the estimated total and ratio.

8.
The variance of the sampling distribution of the sample mean is derived for two sampling designs in which a single cluster is randomly drawn from an autocorrelated population. The derivations are motivated by potential applications to statistical quality control, where a "one-cluster" sampling design may often be used because of ease of implementation, and where it is likely that process output is autocorrelated. Scenarios in statistical process control for which either non-overlapping or overlapping clusters are appropriate are described. The sampling design variance under non-overlapping clusters is related to the sampling design variance under overlapping clusters through the use of a circular population.

9.
Summary.  Complex survey sampling is often used to sample a fraction of a large finite population. In general, the survey is conducted so that each unit (e.g. subject) in the sample has a different probability of being selected into the sample. For generalizability of the sample to the population, both the design and the probability of being selected into the sample must be incorporated in the analysis. In this paper we focus on non-standard regression models for complex survey data. In our motivating example, which is based on data from the Medical Expenditure Panel Survey, the outcome variable is the subject's 'total health care expenditures in the year 2002'. Previous analyses of medical cost data suggest that the variance is approximately equal to the mean raised to the power of 1.5, which is a non-standard variance function. Currently, the regression parameters for this model cannot be easily estimated in standard statistical software packages. We propose a simple two-step method to obtain consistent regression parameter and variance estimates; the method proposed can be implemented within any standard sample survey package. The approach is applicable to complex sample surveys with any number of stages.

10.
This paper illustrates the use of multilevel statistical modelling of cross-classified data to explore interviewers' influence on survey non-response. The results suggest that the variability in whole household refusal and non-contact rates is due more to the influence of interviewers than to the influence of areas. The results from separate logistic regression models are compared with the results from multinomial models using a polytomous dependent variable (refusals, non-contacts and responses). Using the cross-classified multilevel approach allows us to estimate correlations between refusals and non-contacts, suggesting that interviewers who are good at reducing whole household refusals are also good at reducing whole household non-contacts.

11.
The paper investigates non-negative quadratic unbiased (NnQU) estimators of positive semi-definite quadratic forms, for use during the survey sampling of finite population values. It examines several different NnQU estimators of the variance of estimators of population total, under various sampling designs. It identifies an optimal quadratic unbiased estimator of the variance of the Horvitz-Thompson estimator of population total.

12.
Estimation of price indexes in the United States is generally based on complex rotating panel surveys. The sample for the Consumer Price Index, for example, is selected in three stages (geographic areas, establishments, and individual items), with 20% of the sample being replaced by rotation each year. At each period, a time series of data is available for use in estimation. This article examines how to best combine data for estimation of long-term and short-term changes and how to estimate the variances of the index estimators in the context of two-stage sampling. I extend the class of estimators, introduced by Valliant and Miller, of Laspeyres indexes formed using sample data collected from the current period back to a previous base period. Linearization estimators of variance for indexes of long-term and short-term change are derived. The theory is supported by an empirical simulation study using two-stage sampling of establishments and items from a population derived from U.S. Bureau of Labor Statistics data.

13.
Two-stage procedures are introduced to control the width and coverage (validity) of confidence intervals for the estimation of the mean, the between-groups variance component and certain ratios of the variance components in one-way random effects models. The procedures use the pilot sample data to estimate an “optimal” group size and then proceed to determine the number of groups by a stopping rule. Such sampling plans give rise to unbalanced data, which are consequently analyzed by the harmonic mean method. Several asymptotic results concerning the proposed procedures are given along with simulation results to assess their performance in moderate sample size situations. The proposed procedures were found to effectively control the width and probability of coverage of the resulting confidence intervals in all cases and were also found to be robust in the presence of missing observations. From a practical point of view, the procedures are illustrated using a real data set and it is shown that the resulting unbalanced designs tend to require smaller sample sizes than are needed in a corresponding balanced design where the group size is arbitrarily pre-specified.

14.
In practical survey sampling, missing data are unavoidable due to nonresponse, rejected observations by editing, disclosure control, or outlier suppression. We propose a calibrated imputation approach so that valid point and variance estimates of the population (or domain) totals can be computed by the secondary users using simple complete‐sample formulae. This is especially helpful for variance estimation, which generally requires additional information and tools that are unavailable to the secondary users. Our approach is natural for continuous variables, where the estimation may be either based on reweighting or imputation, including possibly their outlier‐robust extensions. We also propose a multivariate procedure to accommodate the estimation of the covariance matrix between estimated population totals, which facilitates variance estimation of the ratios or differences among the estimated totals. We illustrate the proposed approach using simulation data in supplementary materials that are available online.

15.
Non-sampling errors have many different sources, can occur at different stages of the survey, and result from the error contributions of different personnel or agencies involved in executing the survey. In interview-based surveys, the interviewer effectively influences both the response error and the response rate. In the present investigation, an estimation procedure is proposed to study the contribution to variance of response errors and non-response errors introduced by the interviewer.

16.
Abstract. Systematic sampling is frequently used in surveys, because of its ease of implementation and its design efficiency. An important drawback of systematic sampling, however, is that no direct estimator of the design variance is available. We describe a new estimator of the model‐based expectation of the design variance, under a non‐parametric model for the population. The non‐parametric model is sufficiently flexible that it can be expected to hold at least approximately in many situations with continuous auxiliary variables observed at the population level. We prove the model consistency of the estimator for both the anticipated variance and the design variance under a non‐parametric model with a univariate covariate. The broad applicability of the approach is demonstrated on a dataset from a forestry survey.
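The drawback noted in this abstract can be illustrated with a small sketch: when the whole population is known, the design variance of a 1-in-k systematic sample mean is obtained by enumerating all k possible samples, but a single observed sample reveals only one of those k means. The function names are hypothetical, and the sketch assumes the population size is a multiple of k so that every sample has the same size.

```python
def systematic_sample_means(y, k):
    """Means of the k possible 1-in-k systematic samples of the ordered list y."""
    means = []
    for start in range(k):
        sample = y[start::k]  # units start, start + k, start + 2k, ...
        means.append(sum(sample) / len(sample))
    return means


def design_variance_of_mean(y, k):
    """Exact design variance of the systematic sample mean.

    Each of the k random starts is equally likely, so the variance is the
    average squared deviation of the k sample means from their expectation.
    """
    means = systematic_sample_means(y, k)
    expected = sum(means) / k
    return sum((m - expected) ** 2 for m in means) / k
```

For a population with a linear trend, such as the values 1 to 12 with k = 4, the four sample means differ by the trend alone, which is why the ordering of the frame matters so much for this design.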

17.
Summary.  The number of people to select within selected households has significant consequences for the conduct and output of household surveys. The operational and data quality implications of this choice are carefully considered in many surveys, but the effect on statistical efficiency is not well understood. The usual approach is to select all people in each selected household, where operational and data quality concerns make this feasible. If not, one person is usually selected from each selected household. We find that this strategy is not always justified, and we develop intermediate designs between these two extremes. Current practices were developed when household survey field procedures needed to be simple and robust; however, more complex designs are now feasible owing to the increasing use of computer-assisted interviewing. We develop more flexible designs by optimizing survey cost, based on a simple cost model, subject to a required variance for an estimator of population total. The innovation lies in the fact that household sample sizes are small integers, which creates challenges in both design and estimation. The new methods are evaluated empirically by using census and health survey data, showing considerable improvement over existing methods in some cases.

18.
Abstract

Many researchers have used auxiliary information together with the survey variable to improve the efficiency of estimators of population parameters such as the mean, variance, total and proportion. Ratio and regression estimation are the most commonly used methods; they utilize auxiliary information in different ways to gain high precision of the estimators. Thompson first introduced the concept of adaptive cluster sampling, which is an appropriate technique for collecting samples from rare and clustered populations. In this article, a generalized exponential-type estimator is proposed, and its properties are studied, for the estimation of the variance of a rare and highly clustered population using a single auxiliary variable. A numerical study is carried out on a real and an artificial population to judge the performance of the proposed estimator against the competing estimators. It is shown that the proposed generalized exponential-type estimator is more efficient than the adaptive and non-adaptive estimators under conventional sampling design.

19.
A sampling scheme for selecting a sample of two units with inclusion probability proportional to size is suggested that provides a non-negative estimator of the variance of the Horvitz-Thompson estimator. The suggested sampling scheme is shown to perform better than many of the existing unequal probability and inclusion probability proportional to size sampling schemes for a number of natural populations.
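For readers unfamiliar with the quantities named in this abstract, the sketch below computes the Horvitz-Thompson estimate of a total and the standard Sen-Yates-Grundy variance estimate for a sample of two units. It does not implement the sampling scheme proposed in the paper; the function names are hypothetical, and the first-order and joint inclusion probabilities are assumed to be supplied by whatever design is in use.

```python
def ht_total(y, pi):
    """Horvitz-Thompson estimate of the population total.

    y: sampled values; pi: their first-order inclusion probabilities.
    """
    return sum(yi / p for yi, p in zip(y, pi))


def syg_variance_n2(y_i, y_j, pi_i, pi_j, pi_ij):
    """Sen-Yates-Grundy variance estimate for a sample of two units.

    pi_ij is the joint inclusion probability of the pair. The estimate is
    non-negative whenever pi_i * pi_j >= pi_ij, which is the property a
    sampling scheme has to guarantee for every pair of units.
    """
    return (pi_i * pi_j - pi_ij) / pi_ij * (y_i / pi_i - y_j / pi_j) ** 2
```

As a sanity check, under simple random sampling of 2 units from 4 (each unit with inclusion probability 1/2, each pair with joint probability 1/6), averaging `syg_variance_n2` over all six possible samples reproduces the exact variance of `ht_total`.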

20.
Unequal probability sampling is commonly used for sample selection. In the context of spatial sampling, the variables of interest often present a positive spatial correlation, so that it is intuitively relevant to select spatially balanced samples. In this article, we study the properties of pivotal sampling and propose an application to tessellation for spatial sampling. We also propose a simple conservative variance estimator. We show that the proposed sampling design is spatially well balanced, with good statistical properties and is computationally very efficient.
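The pivotal method studied in this abstract can be sketched in its basic sequential form: two units "duel" at each step, and their inclusion probabilities are updated so that at least one of them is settled at 0 or 1 while each unit's expected inclusion is preserved. This is a generic illustration under assumptions, not the paper's tessellation-based spatial application; the function name, the list-order pairing of units and the tolerance `eps` are hypothetical choices.

```python
import random


def pivotal_sample(pi, rng=None, eps=1e-9):
    """Sequential pivotal sampling; returns the indices of selected units.

    pi: target inclusion probabilities, in the order the units are visited.
    """
    rng = rng or random.Random()
    p = [float(x) for x in pi]
    open_unit = None  # index of the unit whose probability is still in (0, 1)

    for j in range(len(p)):
        if p[j] < eps or p[j] > 1 - eps:
            p[j] = float(round(p[j]))  # already settled at 0 or 1
            continue
        if open_unit is None:
            open_unit = j
            continue
        i = open_unit
        a, b = p[i], p[j]
        s = a + b
        if s < 1:
            # One unit is rejected; the other carries the combined probability.
            if rng.random() < b / s:
                p[i], p[j] = 0.0, s
            else:
                p[i], p[j] = s, 0.0
        else:
            # One unit is selected; the other keeps the remainder s - 1.
            if rng.random() < (1 - b) / (2 - s):
                p[i], p[j] = 1.0, s - 1
            else:
                p[i], p[j] = s - 1, 1.0
        # At most one of the two units is still undecided after the duel.
        open_unit = None
        for k in (i, j):
            if eps < p[k] < 1 - eps:
                open_unit = k
            else:
                p[k] = float(round(p[k]))

    if open_unit is not None:
        # Probabilities did not sum to an integer: settle the last unit at random.
        p[open_unit] = 1.0 if rng.random() < p[open_unit] else 0.0
    return [k for k, x in enumerate(p) if x == 1.0]
```

Because neighbouring units in the visiting order compete with each other, ordering the units along a spatial path tends to spread the sample out, which is the intuition behind using the method for spatially balanced designs.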
