Similar literature
1.
We give a formal definition of a representative sample, but roughly speaking, it is a scaled-down version of the population, capturing its characteristics. New methods for selecting representative probability samples in the presence of auxiliary variables are introduced. Representative samples are needed for multipurpose surveys, when several target variables are of interest. Such samples also enable estimation of parameters in subspaces and improved estimation of target variable distributions. We describe how two recently proposed sampling designs can be used to produce representative samples. Both designs use distance between population units when producing a sample. We propose a distance function that can calculate distances between units in general auxiliary spaces. We also propose a variance estimator for the commonly used Horvitz–Thompson estimator. Real data as well as illustrative examples show that representative samples are obtained and that the variance of the Horvitz–Thompson estimator is reduced compared with simple random sampling.

2.
In this paper, the Horvitz–Thompson estimator used in adaptive cluster sampling is extended to a continuous universe. The main new results are presented as theorems. The primary notions for a discrete population are transferred to a continuous population. First- and second-order inclusion probabilities for networks are derived. The Horvitz–Thompson estimator for adaptive cluster sampling in a continuous universe is constructed, and its unbiasedness is proven. The variance and an unbiased variance estimator are derived. Finally, the theory is illustrated with an example.

3.
A balanced sampling design has the interesting property that the Horvitz–Thompson estimators of totals for a set of balancing variables are equal to the totals we want to estimate; the variances of the Horvitz–Thompson estimators of the variables of interest are therefore reduced as a function of their correlations with the balancing variables. Since it is hard to derive an analytic expression for the joint inclusion probabilities, we derive a general approximation of the variance based on a residual technique. This approximation is useful even in the particular case of unequal probability sampling with fixed sample size. Finally, a set of numerical studies with an original methodology validates this approximation.

4.
Two-phase sampling is often used for estimating a population total or mean when the cost per unit of collecting auxiliary variables, $x$, is much smaller than the cost per unit of measuring a characteristic of interest, $y$. In the first phase, a large sample $s_1$ is drawn according to a specific sampling design $p(s_1)$, and auxiliary data $x$ are observed for the units $i \in s_1$. Given the first-phase sample $s_1$, a second-phase sample $s_2$ is selected from $s_1$ according to a specified sampling design $p(s_2 \mid s_1)$, and $(y, x)$ is observed for the units $i \in s_2$. In some cases, the population totals of some components of $x$ may also be known. Two-phase sampling is used for stratification at the second phase or both phases and for regression estimation. Horvitz–Thompson-type variance estimators are used for variance estimation. However, the Horvitz–Thompson (Horvitz & Thompson, J. Amer. Statist. Assoc. 1952) variance estimator in uni-phase sampling is known to be highly unstable and may take negative values when the units are selected with unequal probabilities. On the other hand, the Sen–Yates–Grundy variance estimator is relatively stable and non-negative for several unequal probability sampling designs with fixed sample sizes. In this paper, we extend the Sen–Yates–Grundy (Sen, J. Ind. Soc. Agric. Statist. 1953; Yates & Grundy, J. Roy. Statist. Soc. Ser. B 1953) variance estimator to two-phase sampling, assuming fixed first-phase sample size and fixed second-phase sample size given the first-phase sample. We apply the new variance estimators to two-phase sampling designs with stratification at the second phase or both phases. We also develop Sen–Yates–Grundy-type variance estimators of the two-phase regression estimators that make use of the first-phase auxiliary data and known population totals of some of the auxiliary variables.
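
As a rough illustration of the two variance estimators contrasted in this abstract, the sketch below computes the Horvitz–Thompson estimate of a total together with the Horvitz–Thompson and Sen–Yates–Grundy variance estimates for a single-phase, fixed-size design. It is not the two-phase extension developed in the paper; the function names and the toy data (simple random sampling of n = 2 units from N = 5) are assumptions made purely for illustration.

```python
from itertools import combinations


def ht_total(y, pi):
    """Horvitz-Thompson estimate of the population total."""
    return sum(yi / p for yi, p in zip(y, pi))


def ht_variance(y, pi, pi_joint):
    """Horvitz-Thompson variance estimator (may be negative)."""
    n = len(y)
    v = sum((1 - pi[i]) / pi[i] ** 2 * y[i] ** 2 for i in range(n))
    for i in range(n):
        for j in range(n):
            if i != j:
                pij = pi_joint[i][j]
                v += (pij - pi[i] * pi[j]) / (pij * pi[i] * pi[j]) * y[i] * y[j]
    return v


def syg_variance(y, pi, pi_joint):
    """Sen-Yates-Grundy variance estimator for fixed-size designs."""
    v = 0.0
    for i, j in combinations(range(len(y)), 2):
        pij = pi_joint[i][j]
        v += (pi[i] * pi[j] - pij) / pij * (y[i] / pi[i] - y[j] / pi[j]) ** 2
    return v


# Hypothetical toy data: simple random sampling without replacement.
N, n = 5, 2
y = [3.0, 7.0]                        # observed values in the sample
pi = [n / N] * n                      # first-order inclusion probabilities
pij = n * (n - 1) / (N * (N - 1))     # second-order inclusion probability
pi_joint = [[pi[0], pij], [pij, pi[1]]]

print(ht_total(y, pi), ht_variance(y, pi, pi_joint), syg_variance(y, pi, pi_joint))
```

Under simple random sampling the two variance estimators coincide. They generally differ under unequal probabilities, where the Sen–Yates–Grundy form stays non-negative whenever $\pi_i \pi_j \ge \pi_{ij}$ for every sampled pair.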

5.
For fixed size sampling designs with high entropy, it is well known that the variance of the Horvitz–Thompson estimator can be approximated by the Hájek formula. The interest of this asymptotic variance approximation is that it only involves the first order inclusion probabilities of the statistical units. We extend this variance formula when the variable under study is functional, and we prove, under general conditions on the regularity of the individual trajectories and the sampling design, that we can get a uniformly convergent estimator of the variance function of the Horvitz–Thompson estimator of the mean function. Rates of convergence to the true variance function are given for rejective sampling. We deduce, under conditions on the entropy of the sampling design, that it is possible to build confidence bands whose coverage is asymptotically the desired one via simulation of Gaussian processes with variance function given by the Hájek formula. Finally, the accuracy of the proposed variance estimator is evaluated on samples of electricity consumption data measured every half hour over a period of 1 week.

6.
A sampling scheme for selecting a sample of two units with inclusion probability proportional to size is suggested, which provides a non-negative estimator of the variance of the Horvitz–Thompson estimator. The suggested sampling scheme is shown to perform better than many of the existing unequal probability and inclusion probability proportional to size sampling schemes for a number of natural populations.

7.
With a growing interest in using non-representative samples to train prediction models for numerous outcomes, it is necessary to account for the sampling design that gives rise to the data in order to assess the generalized predictive utility of a proposed prediction rule. After learning a prediction rule based on a non-uniform sample, it is of interest to estimate the rule's error rate when applied to unobserved members of the population. Efron (1986) proposed a general class of covariance penalty inflated prediction error estimators that assume the available training data are representative of the target population for which the prediction rule is to be applied. We extend Efron's estimator to the complex sample context by incorporating Horvitz–Thompson sampling weights and show that it is consistent for the true generalization error rate when applied to the underlying superpopulation. The resulting Horvitz–Thompson–Efron estimator is equivalent to dAIC, a recent extension of Akaike's information criterion to survey sampling data, but is more widely applicable. The proposed methodology is assessed with simulations and is applied to models predicting renal function obtained from the large-scale National Health and Nutrition Examination Study survey. The Canadian Journal of Statistics 48: 204–221; 2020 © 2019 Statistical Society of Canada

8.
A class of designs for sampling two units without replacement with inclusion probability proportional to size is proposed in this article. Many well-known probability proportional to size sampling designs are special cases of this class. The first- and second-order inclusion probabilities of this class satisfy important properties and provide a non-negative variance estimator of the Horvitz–Thompson estimator for the population total. A suitable choice of the first- and second-order inclusion probabilities from this class can be used to reduce the variance estimator of the Horvitz–Thompson estimator. Comparisons between different proportional to size sampling designs through real data and artificial examples are given. The examples show that, in most cases, the minimum variance of the Horvitz–Thompson estimator obtained from the proposed design is not attainable by any of the well-known designs.

9.
The sample coordination problem involves maximization or minimization of the overlap of sampling units in different or repeated surveys. Several optimal techniques using transportation theory, controlled rounding, and controlled selection have been suggested in the literature to solve the sample coordination problem. In this article, using multiple objective programming, we propose a method for sample coordination which facilitates variance estimation using the Horvitz–Thompson estimator. The proposed procedure can be applied to any two sample surveys having an identical universe and stratification. Some examples are discussed to demonstrate the utility of the proposed procedure.

10.
Simple location-shifts for the study or auxiliary character are proposed under Midzuno–Sen sampling from a finite population. These aim at improving the efficiency of the classical Horvitz–Thompson estimator or the unbiased ratio estimator of a population total. It is demonstrated that the choice of the translation parameters is flexible. A few methods for assessing these parameters are outlined. The gain in efficiency of estimation is illustrated.

11.
In stratified sampling, methods for the allocation of effort among strata usually rely on some measure of within-stratum variance. If we do not have enough information about these variances, adaptive allocation can be used. In adaptive allocation designs, surveys are conducted in two phases. Information from the first phase is used to allocate the remaining units among the strata in the second phase. Brown et al. [Adaptive two-stage sequential sampling, Popul. Ecol. 50 (2008), pp. 239–245] introduced an adaptive allocation sampling design, where the final sample size was random, and an unbiased estimator. Here, we derive an unbiased variance estimator for the design, and consider a related design where the final sample size is fixed. Having a fixed final sample size can make survey planning easier. We introduce a biased Horvitz–Thompson type estimator and a biased sample mean type estimator for the sampling designs. We conduct two simulation studies, on honey producers in Kurdistan and on a synthetic zirconium distribution in a region on the moon. Results show that the introduced estimators are more efficient than the available estimators for both variable and fixed sample size designs, and than the conventional unbiased estimator of the stratified simple random sampling design. To evaluate the efficiency of the introduced designs and their estimators further, we first review some well-known adaptive allocation designs and compare their estimators with the introduced estimators. Simulation results show that the introduced estimators are more efficient than the available estimators of these well-known adaptive allocation designs.

12.
We extend the problem of obtaining an estimator for the finite population mean parameter incorporating complete auxiliary information through calibration estimation in survey sampling under a functional data framework. The functional calibration sampling weights of the estimator are obtained by matching the calibration estimation problem with the maximum entropy on the mean – MEM – principle. In particular, the calibration estimation is viewed as an infinite-dimensional linear inverse problem following the structure of the MEM approach. We give a precise theoretical setting and estimate the functional calibration weights assuming, as prior measures, the centred Gaussian and compound Poisson random measures. Additionally, through a simple simulation study, we show that the proposed functional calibration estimator improves its accuracy compared with the Horvitz–Thompson one.

13.
Drawing distinct units without replacement and with unequal probabilities from a population is a problem often considered in the literature (e.g. Hanif and Brewer, 1980, Int. Statist. Rev. 48, 317–355). In such a case, the sample mean is a biased estimator of the population mean. For this reason, we use the unbiased Horvitz–Thompson estimator (1952). In this work, we focus our interest on the variance of this estimator. The variance is cumbersome to compute because it requires the calculation of a large number of second-order inclusion probabilities. It would be helpful to use an approximation that does not need heavy calculations. The Hájek (1964) variance approximation provides this advantage as it is free of second-order inclusion probabilities. Hájek (1964) proved that this approximation is valid under restrictive conditions that are usually not fulfilled in practice. In this paper, we give more general conditions and we show that this approximation remains acceptable for most practical problems.
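
To make concrete what "free of second-order inclusion probabilities" means, here is a minimal sketch of one commonly quoted Hájek-type variance estimator for the Horvitz–Thompson total. The abstract does not state the exact expression used in the paper, so the formula, the function name `hajek_variance`, and the toy data are all assumptions for illustration only.

```python
def hajek_variance(y, pi):
    """Hajek-type variance estimate of the Horvitz-Thompson total.

    Assumed form (not taken from the abstract):
        v = n / (n - 1) * sum_i (1 - pi_i) * (y_i / pi_i - B)^2,
        B = sum_i (1 - pi_i) * y_i / pi_i / sum_i (1 - pi_i).
    Only first-order inclusion probabilities pi_i are needed.
    """
    n = len(y)
    if n < 2:
        raise ValueError("at least two sampled units are required")
    c = [1.0 - p for p in pi]              # weights 1 - pi_i
    z = [yi / p for yi, p in zip(y, pi)]   # expanded values y_i / pi_i
    b_hat = sum(ci * zi for ci, zi in zip(c, z)) / sum(c)
    return n / (n - 1) * sum(ci * (zi - b_hat) ** 2 for ci, zi in zip(c, z))


# Hypothetical toy data: n = 2 units drawn from N = 5 with equal probabilities.
y = [3.0, 7.0]
pi = [0.4, 0.4]
v = hajek_variance(y, pi)
print(v)
```

With equal inclusion probabilities this expression reduces to the usual simple random sampling variance estimator $N^2 (1 - n/N) s^2 / n$, which gives a quick sanity check on any implementation.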

14.
We show that the Hájek (Ann. Math Statist. (1964) 1491) variance estimator can be used to estimate the variance of the Horvitz–Thompson estimator when the Chao sampling scheme (Chao, Biometrika 69 (1982) 653) is implemented. This estimator is simple and can be implemented in any statistical package. We use a numerical and an analytic method to show that this estimator can be used. A series of simulations supports our findings.

15.
Sarjinder Singh. Statistics, 2013, 47(3): 566-574
In this note, a dual problem to the calibration of design weights of the Deville and Särndal [Calibration estimators in survey sampling, J. Amer. Statist. Assoc. 87 (1992), pp. 376–382] method has been considered. We conclude that the chi-squared distance between the design weights and the calibrated weights equals the square of the standardized Z-score obtained by the difference between the known population total of the auxiliary variable and its corresponding Horvitz and Thompson [A generalization of sampling without replacement from a finite universe, J. Amer. Statist. Assoc. 47 (1952), pp. 663–685] estimator divided by the sample standard deviation of the auxiliary variable to obtain the linear regression estimator in survey sampling.

16.
The authors consider semiparametric efficient estimation of parameters in the conditional mean model for a simple incomplete data structure in which the outcome of interest is observed only for a random subset of subjects but covariates and surrogate (auxiliary) outcomes are observed for all. They use optimal estimating function theory to derive the semiparametric efficient score in closed form. They show that when covariates and auxiliary outcomes are discrete, a Horvitz-Thompson type estimator with empirically estimated weights is semiparametric efficient. The authors give simulation studies validating the finite-sample behaviour of the semiparametric efficient estimator and its asymptotic variance; they demonstrate the efficiency of the estimator in realistic settings.

17.
At the design and estimation stages of a survey, large survey organizations often use auxiliary information. This article discusses various procedures for improving variance estimation of the Horvitz–Thompson estimator of a finite population total with the aid of auxiliary information. To study the design-based properties of the proposed variance estimators relative to the standard one, a small-scale Monte Carlo study is performed.

18.
In a missing-data setting, we want to estimate the mean of a scalar outcome, based on a sample in which an explanatory variable is observed for every subject while responses are missing by happenstance for some of them. We consider two kinds of estimates of the mean response when the explanatory variable is functional. One is based on the average of the predicted values and the second one is a functional adaptation of the Horvitz–Thompson estimator. We show that the infinite dimensionality of the problem does not affect the rates of convergence by stating that the estimates are root-n consistent, under the missing at random (MAR) assumption. These asymptotic features are complemented by simulated experiments illustrating the ease of implementation and the good behaviour of the method on finite sample sizes. This is the first paper emphasizing that the insensitivity of averaged estimates, well known in multivariate non-parametric statistics, remains true for an infinite-dimensional covariable. In this sense, this work opens the way for various other results of this kind in functional data analysis.

19.
Variance estimation under systematic sampling with probability proportional to size is known to be a difficult problem. We attempt to tackle this problem by the bootstrap resampling method. It is shown that the usual way to bootstrap fails to give satisfactory variance estimates. As a remedy, we propose a double bootstrap method which is based on certain working models and involves two levels of resampling. Unlike existing methods which deal exclusively with the Horvitz–Thompson estimator, the double bootstrap method can be used to estimate the variance of any statistic. We illustrate this within the context of both mean and median estimation. Empirical results based on five natural populations are encouraging.

20.
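Gong Hongyu, Chen Ya. Statistical Research (统计研究), 2018, 35(12): 113-122
This paper discusses two problems: improving sample representativeness and conducting multipurpose surveys. First, it proposes a new multipurpose sampling method for improving sample representativeness that combines increasing the sample size with adjusting the sample structure: a balanced design with supplementary samples. Additional units are drawn so that the supplementary sample, combined with the original sample, forms a new balanced sample which, relative to the initial sample, reduces the structural deviation between the sample and the population. A balanced sample is one for which the Horvitz–Thompson estimator of the auxiliary variable totals equals the true population totals. Second, by choosing auxiliary variables that are correlated with several target parameters, a balanced sample makes a single sample well representative for each of the target parameters, and thus supports a multipurpose survey. Using county-level data from the sixth national population census of 2010 and selecting several target parameters, a post hoc evaluation of the balanced sample obtained after supplementation shows that the supplementary balanced design effectively improves the sample structure, bringing it close to the population structure and reducing the error of the target estimates. It also shows that balanced sampling designs can support multipurpose surveys and improve the efficiency with which the sample is used.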
