
1.

An exact small sample theory for post-stratification

D.C. Doss H.O. Hartley G.R. Somayajulu 《Journal of statistical planning and inference》1979,3(3):235-247

This paper develops a genuine small-sample theory for post-stratification. It defines a ratio estimator of the population mean, derives its bias and exact variance, and discusses variance estimation. The estimator has a within-strata component of variance, comparable with that obtained under proportional-allocation stratified sampling, and a between-strata component of variance that tends to zero as the overall sample size becomes large. Certain optimality properties of the estimator are obtained. The generalization of post-stratification from simple random sampling to post-stratification used in conjunction with stratification and multi-stage designs is also discussed.
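As a rough illustration of the kind of estimator this abstract describes, the sketch below computes a post-stratified mean as a weighted sum of stratum sample means. The abstract does not give the paper's formula, so the function name `post_stratified_mean`, the input layout, and the handling of empty strata (renormalising the weights over the strata that received at least one unit, which makes the estimator ratio-type) are all assumptions made here for illustration.

```python
from collections import defaultdict


def post_stratified_mean(sample, stratum_weights):
    """Post-stratified estimate of a population mean.

    sample          : list of (stratum_label, y) pairs from a simple random sample
    stratum_weights : dict mapping stratum_label -> W_h = N_h / N (known shares)

    Assumption: strata with no sampled units are dropped and the remaining
    weights are renormalised. This is one simple convention, not necessarily
    the one used in the paper.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label, y in sample:
        sums[label] += y
        counts[label] += 1

    covered = sum(w for h, w in stratum_weights.items() if counts[h] > 0)
    if covered == 0:
        raise ValueError("no sampled unit falls in any listed stratum")

    weighted = sum(
        w * sums[h] / counts[h]
        for h, w in stratum_weights.items()
        if counts[h] > 0
    )
    return weighted / covered


if __name__ == "__main__":
    data = [("a", 1.0), ("a", 3.0), ("b", 10.0)]
    print(post_stratified_mean(data, {"a": 0.5, "b": 0.5}))
```

The stratum labels are assigned after the sample is drawn, which is why the per-stratum counts are random and an empty stratum has to be handled at all.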

2.

Inverse Adaptive Cluster Sampling with Unequal Selection Probabilities: Case Studies on Crab Holes and Arsenic Pollution


Mohammad Salehi Mohammad Moradi Jassim A. Al Khayat Jennifer Brown Adil Eltayeb Mohamed Yousif 《Australian & New Zealand Journal of Statistics》2015,57(2):189-201

Adaptive cluster sampling is an efficient method of estimating the parameters of rare and clustered populations. The method mimics how biologists would like to collect data in the field by targeting survey effort to localised areas where the rare population occurs. Another popular sampling design is inverse sampling, which was developed to obtain a sample containing a predetermined number of rare events. Ideally, in inverse sampling, the resulting sample will be large enough to ensure reliable estimation of population parameters. To combine the good properties of these two designs, we introduce inverse adaptive cluster sampling with unequal selection probabilities. We develop an unbiased estimator of the population total that is applicable to data obtained from such designs, together with numerical approximations to this estimator. The efficiency of the proposed estimators is investigated through simulation studies based on two real populations: crabs in Al Khor, Qatar, and arsenic pollution in Kurdistan, Iran. The simulation results show that our estimators are efficient.

3.

On the Problem of Compromise Allocation in Multi-Response Stratified Sample Surveys

Saman Khowaja Shazia Ghufran M. J. Ahsan 《Communications in Statistics - Simulation and Computation》2013,42(4):790-799

In stratified sample surveys, the problem of determining the optimum allocation is well known from articles published in 1923 by Tschuprow and in 1934 by Neyman. These articles give the optimum sample sizes to be selected from each stratum, for which the sampling variance of the estimator is minimum for a fixed total survey cost, or the cost is minimum for a fixed precision of the estimator. If more than one characteristic is to be measured on each selected unit, that is, if the survey is a multi-response survey, then determining the optimum sample sizes for the various strata becomes more complex because no single optimality criterion suits all the characteristics. Many authors have discussed compromise criteria that provide a compromise allocation, which is optimum for all characteristics at least in some sense. Almost all of these authors worked out the compromise allocation by minimizing some function of the sampling variances of the estimators under a single cost constraint. A serious objection to this approach is that the variances are not unit free, so minimizing a function of variances may not be an appropriate objective for obtaining a compromise allocation. This suggests using coefficients of variation instead of variances. In the present article, the problem of compromise allocation is formulated as a multi-objective non-linear programming problem. By linearizing the non-linear objective functions at their individual optima, the problem is approximated by an integer linear programming problem. A goal programming technique is then used to obtain a solution to the approximated problem.
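For orientation, the single-characteristic case that the abstract starts from can be sketched as follows. This is the textbook Neyman allocation for a fixed total sample size, with the variance of the stratified mean and a unit-free coefficient of variation. It is not the multi-objective goal-programming method of the article, and the function names and the example figures are made up for illustration.

```python
import math


def neyman_allocation(n, sizes, sds):
    """Real-valued Neyman allocation: n_h proportional to N_h * S_h."""
    products = [N_h * S_h for N_h, S_h in zip(sizes, sds)]
    total = sum(products)
    return [n * p / total for p in products]


def stratified_mean_variance(alloc, sizes, sds):
    """Variance of the stratified sample mean under SRS without replacement
    within strata: sum of W_h^2 * S_h^2 / n_h * (1 - n_h / N_h)."""
    N = sum(sizes)
    return sum(
        (N_h / N) ** 2 * S_h ** 2 / n_h * (1 - n_h / N_h)
        for n_h, N_h, S_h in zip(alloc, sizes, sds)
    )


def coefficient_of_variation(variance, mean):
    """Unit-free precision measure: standard error divided by the mean."""
    return math.sqrt(variance) / mean


if __name__ == "__main__":
    sizes, sds = [100, 200], [2.0, 1.0]      # hypothetical strata
    neyman = neyman_allocation(40, sizes, sds)
    proportional = [40 * N_h / sum(sizes) for N_h in sizes]
    print(neyman, stratified_mean_variance(neyman, sizes, sds))
    print(proportional, stratified_mean_variance(proportional, sizes, sds))
```

With several characteristics, each one has its own stratum standard deviations and therefore its own Neyman allocation, which is why a compromise is needed.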

4.

INVERSE SAMPLING FOR DOMAIN ESTIMATION IN A STRATIFIED POPULATION

Rahul Mukerjee Sujit K. Basu 《Australian & New Zealand Journal of Statistics》1993,35(3):293-302

For a stratified population under inverse sampling, we propose and study an unbiased estimator for the mean of units belonging to a domain with specific features. An alternative, simpler, ratio-type estimator is also considered. Empirical studies show that strategies based on inverse sampling can be superior to a more traditional strategy based on stratified simple random sampling with a fixed number of draws in each stratum.

5.

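An Exploratory Design for Representative Samples in the Sample Survey of Industrial Enterprises Below Designated Size: Balanced Sampling Design

Gong Hongyu 《Statistics & Information Forum (统计与信息论坛)》2017,(4):8-15

The sample survey of industrial enterprises below designated size is an important part of socio-economic statistical surveys and supplies basic data for national economic accounting, and sample representativeness directly determines the results of statistical inference. Drawing a balanced sample from the enterprise directory makes the sample structure resemble the population structure. A balanced sample is one that satisfies the following condition: the Hansen-Hurwitz estimate of the auxiliary variable equals the true population total. A balanced sampling design requires a complete sampling frame with rich auxiliary information, and government statistical data can provide sufficient support for this. An empirical analysis based on the 2009 industrial enterprise database shows that under the balanced sampling design the relative error of the estimated population total is very small; in particular, the estimated mean is very close to the true population value and is approximately unbiased. Compared with simple random sampling, the balanced sampling design is more efficient.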

6.

Gap-based inverse sampling

Bardia Panahbehagh Jennifer Brown 《Communications in Statistics - Theory and Methods》2017,46(19):9651-9661

We present a new inverse sampling design for surveys of rare events, Gap-Based Inverse Sampling. In the design, sampling stops if no new rare events are found after a predetermined interval, or gap. The length of the gap that follows a rare event is used to limit sampling effort. We present stopping rules based on the gap length, the total number of rare events found, and a fixed upper limit on survey effort. We illustrate the use of the design with stratified sampling of two biological populations. The design follows the intuitive behavior of a field biologist in stratified sampling: if nothing is found in a stratum after a long search, the surveyor would like to consider the stratum empty and stop searching. Our design is appealing for surveying rare events (for example, a rare species) with stratified sampling where some strata are likely to be completely empty.
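The three stopping criteria named in the abstract can be sketched as a simple loop over units in their (already randomised) order of selection. This is a simplified reading of the abstract, not the paper's exact rules: the function name `gap_based_stop`, the parameter names, and the choice to count the gap from the start of sampling when nothing has been found yet are assumptions, and no estimator is attached.

```python
def gap_based_stop(stream, gap, max_events, max_effort):
    """Sample units one at a time until one of three rules fires.

    stream     : iterable of booleans, True when the sampled unit is a rare event
    gap        : stop after this many consecutive units without a rare event
    max_events : stop once this many rare events have been found
    max_effort : stop once this many units have been sampled

    Returns (units_sampled, events_found, reason).
    """
    sampled = events = since_last = 0
    for is_rare in stream:
        sampled += 1
        if is_rare:
            events += 1
            since_last = 0          # the gap restarts after every rare event
        else:
            since_last += 1
        if events >= max_events:
            return sampled, events, "event quota reached"
        if since_last >= gap:
            return sampled, events, "gap exceeded"
        if sampled >= max_effort:
            return sampled, events, "effort limit reached"
    return sampled, events, "stratum exhausted"


if __name__ == "__main__":
    # Hypothetical stratum with no rare events: the search stops after the gap.
    print(gap_based_stop([False] * 50, gap=5, max_events=3, max_effort=40))
```

In an empty stratum the gap rule fires first, which is the behaviour the abstract attributes to a field biologist giving up on a stratum.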

7.

A more efficient mean estimator for judgement post-stratification

《Journal of Statistical Computation and Simulation》2012,82(7):1404-1414

Meeden and Lee [More efficient inferences using ranking information obtained from judgment sampling. J Surv Stat Methodol. 2014;2:38–57] recently showed that one can improve upon the standard unbiased mean estimator for judgement post-stratification (JPS) by using the ordering information in the sample. We propose an alternate mean estimator that uses this same information. This alternate estimator is far simpler to compute than the estimator of Meeden and Lee (2014), and we show through simulations that it typically outperforms the Meeden and Lee (2014) estimator in cases where the rankings are sufficiently good that JPS is useful.

8.

Spatially Balanced Sampling of Continuous Populations

《Scandinavian Journal of Statistics》2018,45(3):792-805

When sampling from a continuous population (or distribution), we often want a rather small sample due to some cost attached to processing the sample or to collecting information in the field. Moreover, a probability sample that allows for design-based statistical inference is often desired. Given these requirements, we want to reduce the sampling variance of the Horvitz–Thompson estimator as much as possible. To achieve this, we introduce different approaches to using the local pivotal method for selecting well-spread samples from multidimensional continuous populations. The results of a simulation study clearly indicate that we succeed in selecting spatially balanced samples and improve the efficiency of the Horvitz–Thompson estimator.

9.

Performance of Interval Estimators for the Inverse Hypergeometric Distribution

Lei Zhang Wenting Xie William D. Johnson 《Communications in Statistics - Simulation and Computation》2015,44(5):1300-1310

The inverse hypergeometric distribution is of interest in applications of inverse sampling without replacement from a finite population where a binary observation is made on each sampling unit. Sampling is performed by randomly choosing units one at a time until a specified number of one of the two types has been selected. Assuming the total number of units in the population is known but the number of each type is not, we consider the problem of estimating this parameter. We use the Delta method to develop approximations for the variance of three parameter estimators. We then propose three large-sample confidence intervals for the parameter. Based on these results, we selected a set of parameter values for the inverse hypergeometric distribution to investigate the performance of these estimators empirically. We evaluate their performance in terms of expected coverage probability and confidence interval length, calculated as means of possible outcomes weighted by the appropriate outcome probabilities for each parameter value considered. The unbiased estimator of the parameter is preferred to the maximum likelihood estimator and to an estimator based on a negative binomial approximation, as evidenced by empirical estimates of closeness to the true parameter value. Confidence intervals based on the unbiased estimator tend to be shorter than those of the two competitors because of its relatively small variance, but at a slight cost in coverage probability.
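The abstract does not state the unbiased estimator's formula. A standard candidate for inverse sampling without replacement, assumed here to be the one meant, is N(r-1)/(n-1), where N is the population size, r the required number of units of the target type (r at least 2), and n the number of draws needed. The sketch below writes down the inverse hypergeometric probability mass function and checks unbiasedness by exact enumeration; the function names and the example values are illustrative.

```python
from fractions import Fraction
from math import comb


def inverse_hypergeom_pmf(n, N, M, r):
    """P(the r-th unit of the target type appears on draw n), when the
    population has N units of which M are of the target type."""
    return Fraction(comb(n - 1, r - 1) * comb(N - n, M - r), comb(N, M))


def unbiased_estimate(N, r, n):
    """Assumed estimator of M: N * (r - 1) / (n - 1), valid for r >= 2."""
    if r < 2:
        raise ValueError("the estimator needs r >= 2")
    return Fraction(N * (r - 1), n - 1)


def exact_expectation(N, M, r):
    """Expected value of the estimator, summed over every possible n."""
    support = range(r, N - M + r + 1)
    return sum(
        inverse_hypergeom_pmf(n, N, M, r) * unbiased_estimate(N, r, n)
        for n in support
    )


if __name__ == "__main__":
    # Hypothetical population: 20 units, 6 of the target type, stop at r = 3.
    print(exact_expectation(20, 6, 3))
```

Exact rational arithmetic is used so that the expectation comes out as the true M with no rounding error.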

10.

On inclusion probabilities and relative estimator bias for Pareto πps sampling

《Journal of statistical planning and inference》2005,128(2):543-567

One means of utilizing auxiliary information in surveys is to sample with inclusion probabilities proportional to given size values, that is, to use a πps design, preferably with fixed sample size. A novel candidate in that context is Pareto πps. This sampling scheme was derived by limit considerations, and it works with a degree of approximation for finite samples. Desired and factual inclusion probabilities do not agree exactly, which in turn leads to some estimator bias. The central topic of this paper is to derive conditions for the bias to be negligible. Practically useful information on the small-sample behavior of Pareto πps can, to the best of our understanding, be gained only by numerical studies. Earlier investigations to that end have been too limited to allow general conclusions, while this paper reports findings from an extensive numerical study. The chief conclusion is that the estimator bias is negligible in almost all situations met in survey practice.
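The abstract does not spell out the selection mechanism. In the usual description of Pareto πps, assumed here, each unit gets a target inclusion probability λ_i = n·x_i / Σx, a uniform random number U_i, and a ranking variable Q_i = [U_i/(1-U_i)] / [λ_i/(1-λ_i)]; the n units with the smallest Q_i form the sample. The sketch below pairs that with the Horvitz–Thompson estimator computed from the target probabilities, which is where the small bias discussed above enters. Function names and the example data are illustrative.

```python
import random


def pareto_pips_sample(sizes, n, rng):
    """Draw a fixed-size Pareto pi-ps sample.

    Returns (sorted sample indices, target inclusion probabilities).
    Assumes every target probability lies strictly between 0 and 1.
    """
    total = sum(sizes)
    lam = [n * x / total for x in sizes]
    if not all(0 < p < 1 for p in lam):
        raise ValueError("target inclusion probabilities must be in (0, 1)")
    ranked = []
    for i, p in enumerate(lam):
        u = rng.random()                      # uniform on [0, 1)
        q = (u / (1 - u)) / (p / (1 - p))     # Pareto ranking variable
        ranked.append((q, i))
    chosen = sorted(ranked)[:n]               # n smallest ranking values
    return sorted(i for _, i in chosen), lam


def horvitz_thompson_total(y, sample, lam):
    """HT estimate of the total, using target (not factual) probabilities."""
    return sum(y[i] / lam[i] for i in sample)


if __name__ == "__main__":
    sizes = [1, 2, 3, 4]                      # hypothetical size measures
    y = [5.0, 9.0, 16.0, 19.0]                # hypothetical study variable
    sample, lam = pareto_pips_sample(sizes, 2, random.Random(0))
    print(sample, horvitz_thompson_total(y, sample, lam))
```

When the study variable is exactly proportional to the size measure, every term y_i/λ_i is the same, so the estimate reproduces the total regardless of which units are selected.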

11.

Improving survey-weighted least squares regression

Lonnie Magee 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1998,60(1):115-126

The weighted least squares (WLS) estimator is often employed in linear regression using complex survey data to deal with the bias in ordinary least squares (OLS) arising from informative sampling. In this paper a 'quasi-Aitken WLS' (QWLS) estimator is proposed. QWLS modifies WLS in the same way that Cragg's quasi-Aitken estimator modifies OLS. It weights by the usual inverse sample inclusion probability weights multiplied by a parameterized function of covariates, where the parameters are chosen to minimize a variance criterion. The resulting estimator is consistent for the superpopulation regression coefficient under fairly mild conditions and has a smaller asymptotic variance than WLS.
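The baseline that QWLS modifies, WLS with inverse inclusion probability weights, can be sketched for a single covariate as below. This shows only the ordinary survey-weighted fit; the extra parameterized weight factor that defines QWLS is not given in the abstract and is not implemented. The function name and the example data are illustrative.

```python
def survey_weighted_line(x, y, inclusion_probs):
    """Fit y = intercept + slope * x by weighted least squares with
    weights w_i = 1 / pi_i (inverse sample inclusion probabilities)."""
    w = [1.0 / p for p in inclusion_probs]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    if sxx == 0:
        raise ValueError("x has no weighted variation")
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical sample in which large-y units were more likely to be drawn.
    x = [0.0, 1.0, 2.0, 3.0]
    y = [2.1, 4.8, 8.3, 10.9]
    probs = [0.1, 0.3, 0.6, 0.9]
    print(survey_weighted_line(x, y, probs))
```

Units that were unlikely to be sampled receive large weights, so they stand in for the many similar population units that were not observed.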

12.

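Sample Rotation in Sample Surveys of Small and Micro Enterprises

Wan Shuchen, Jin Yongjin 《Statistics & Information Forum (统计与信息论坛)》2016,(11):14-19

Data from sample surveys of small and micro enterprises currently receive close attention from all levels of government and from society at large. Because small and micro enterprises are created and dissolved frequently, this paper studies sample rotation theory under the general condition that both the population units and the sample size change, and discusses the sample rotation rate and the estimators, which broadens the applicability of the results. Conclusions on sample rotation under simple random sampling and stratified sampling are obtained, the sampling error of the estimators is effectively controlled, and related empirical studies are carried out. Finally, a sample rotation design pattern and method suited to continuous sample surveys of small and micro enterprises are proposed.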

13.

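An Exploratory Method for Obtaining Representative Samples in Household Surveys: Balanced Sampling Design

Gong Hongyu, Jin Yongjin 《Statistical Research (统计研究)》2015,32(9):84-90

Household surveys are an important part of China's system of socio-economic statistical surveys, and sample representativeness directly determines the quality of statistical data. In multi-stage sampling, the variance among primary units has the dominant effect on estimation. This paper therefore uses county-level data from the 2010 Sixth National Population Census and adopts a balanced sampling design to obtain a representative sample of primary units, namely a balanced sample. A post hoc evaluation of the representative sample shows that the sample structure matches the population structure and that the error of the target estimates is very small, which demonstrates the effectiveness of the balanced design.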

14.

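A Study of Sampling Design for Industrial Enterprises Below Designated Size

Wan Shuchen 《Statistical Research (统计研究)》2021,38(6):116-127

To advance the sample survey of industrial enterprises below designated size and to address problems the survey currently faces, this paper studies improvements to the sampling design. First, it systematically reviews the evolution of the sampling design for this survey and summarizes two features of the current design: full use of a dual-frame design and combined use of three sampling methods. Second, in view of the high density of enterprises in the industrial-park stratum, it explores improving the area sampling design by taking parks into account. The park stratum and the non-park stratum are sampled separately, which addresses the non-sampling error encountered in the survey, and the auxiliary variables are adjusted so that they are all highly correlated with the core indicators, which safeguards the precision of inference and effectively raises survey efficiency. An empirical simulation for a province in eastern China yields conclusions on how the park-based sampling design improves survey work. Third, in view of the administrative needs of governments at all levels and the adjusted division of work between statistical bureaus and survey teams, it presents the main research results on the theory and empirical application of sample supplementation for this survey. Finally, given the wide range of data sources in the big data era, it proposes ideas for further improving the sampling design through multiple-frame designs and through the use of auxiliary variables to raise the precision of inference under sample rotation.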

15.

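Statistical Inference for Web Access Panel Surveys

Liu Zhan, Jin Yongjin 《Statistics & Information Forum (统计与信息论坛)》2017,(2):3-10

How to solve the statistical inference problem for web access panel surveys is a serious challenge facing web surveys in the big data era. To address it, this paper proposes combining the survey sample from the web access panel with a probability sample and constructing pseudo-weights through inverse propensity score weighting and weighting-class adjustment in order to estimate the target population. The variance is then estimated with the Vwr method based on with-replacement probability sampling, the Vgreg method based on generalized regression estimation, and the Jackknife method, and the performance of the methods is compared. The study shows that, whether the probability sample is large or small, the proposed estimator of the population mean performs well, and that among the variance estimators the Jackknife method performs best.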

16.

Multiple inverse sampling in post-stratification with subpopulation sizes unknown: a solution for quota sampling

《Journal of statistical planning and inference》2005,131(2):379-392

We extend traditional inverse sampling to the multiple case. We then modify the multiple inverse sampling design into a version that begins with a simple random sample, similar to Chang et al. (J. Statist. Plan. Inference 69 (1998) 209), and a truncated version similar to Chang et al. (J. Statist. Plan. Inference 76 (1999) 215). Using Murthy (Sankhya 18 (1957) 379), we develop unbiased estimators and unbiased variance estimators for these designs. These unbiased estimators can also be applied to quota sampling, a sampling scheme frequently used by practitioners. Multiple inverse sampling may be viewed as an improved version of quota sampling in some sense. A small simulation study shows that our estimators of the proportions (weights) of subpopulations are more efficient and robust than available estimators.

17.

Post-stratification based direct adjustment approach to a missing data problem in clinical trials

《Journal of statistical planning and inference》2001,96(1):247-262

In clinical trials we always expect some missing data. If data are missing completely at random, then missing data can be ignored for the purpose of statistical inference. In most situations, however, ignoring missing data will introduce bias. Adjustment for missing data is possible if the missingness mechanism is known, which is rare in real problems. Our approach is to estimate directly the mean outcome of each treatment group in the presence of missing data. To this end, we post-stratify all the subjects by the expected value of the outcome (or by a variable predictive of the outcome) so that subjects within a stratum may be considered homogeneous with respect to the expected outcome, and we assume that subjects within a stratum are missing at random. We apply this post-stratification approach to a recently concluded clinical trial in which a high proportion of data are missing and the missingness depends on the same factors that affect the outcome variable. A simulation study shows that the post-stratification approach reduces the bias substantially compared with the naive approach in which only non-missing subjects are analyzed.

18.

Confidence intervals for the closed population size under inverse sampling without replacement

Mohammad Mohammadi 《Communications in Statistics - Theory and Methods》2019,48(14):3518-3529

Inverse sampling is an appropriate design for the second phase of capture-recapture experiments, as it provides an exactly unbiased estimator of the population size. However, the sampling distribution of the resulting estimator tends to be highly right-skewed for small recapture samples, so the traditional Wald-type confidence intervals appear to be inappropriate. The objective of this paper is to study the performance of interval estimators for the population size under inverse recapture sampling without replacement. To this end, we consider the Wald-type, logarithmic transformation-based, Wilson score, likelihood ratio, and exact methods. We also propose some bootstrap confidence intervals for the population size, including the with-replacement bootstrap (BWR), the without-replacement bootstrap (BWO), and the Rao–Wu rescaling method. A Monte Carlo simulation is used to evaluate the performance of the suggested methods in terms of coverage probability, error rates, and standardized average length. Our results show that the likelihood ratio and exact confidence intervals are preferable to the other competitors: their coverage probabilities are close to the desired nominal level for any sample size, with a more balanced error rate for the exact method and a shorter length for the likelihood ratio method. The BWO and Rao–Wu rescaling methods may also provide good intervals in some situations; however, their coverage probabilities are not invariant with respect to the population parameters, so they must be used with care.
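The abstract does not write out the point estimator. A standard form for inverse recapture sampling without replacement, assumed here, is n(M+1)/r - 1, where M animals were marked in the first phase, sampling in the second phase continues until r marked animals have been recaptured, and n is the number of draws that takes. The sketch below simulates the second phase and averages the estimator over many replicates; the function names and the population values are illustrative, and none of the interval methods compared in the paper is implemented.

```python
import random


def draws_until_r_marked(N, M, r, rng):
    """Simulate phase two: draw without replacement from N animals, M of
    them marked, and return the draw on which the r-th marked one appears."""
    if not 1 <= r <= M <= N:
        raise ValueError("need 1 <= r <= M <= N")
    population = [True] * M + [False] * (N - M)
    rng.shuffle(population)
    marked = 0
    for n, is_marked in enumerate(population, start=1):
        if is_marked:
            marked += 1
            if marked == r:
                return n


def population_size_estimate(n, M, r):
    """Assumed unbiased estimator of N under inverse recapture sampling."""
    return n * (M + 1) / r - 1


def simulated_mean_estimate(N, M, r, reps, seed):
    """Average of the estimator over `reps` simulated recapture phases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += population_size_estimate(draws_until_r_marked(N, M, r, rng), M, r)
    return total / reps


if __name__ == "__main__":
    # Hypothetical closed population: 50 animals, 10 marked, stop at r = 3.
    print(simulated_mean_estimate(50, 10, 3, reps=20000, seed=1))
```

Individual estimates vary widely and are skewed to the right when r is small, which is the feature that makes Wald-type intervals unreliable in this setting.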

19.

Sampling design proportional to order statistic of auxiliary variable

Janusz L. Wywiał 《Statistical Papers》2008,49(2):277-289

Sampling designs dependent on sample moments of auxiliary variables are well known. Lahiri (Bull Int Stat Inst 33:133–140, 1951) considered a sampling design proportionate to a sample mean of an auxiliary variable. Singh and Srivastava (Biometrika 67(1):205–209, 1980) proposed a sampling design proportionate to a sample variance, while Wywiał (J Indian Stat Assoc 37:73–87, 1999) proposed a sampling design proportionate to a sample generalized variance of auxiliary variables. Some other sampling designs dependent on moments of an auxiliary variable were considered, e.g., in Wywiał (Stat Transit 4(5):779–798, 2000; Some contributions to multivariate methods in survey sampling. Katowice University of Economics, Katowice, 2003a), where the accuracy of some sampling strategies was also compared. These sampling designs are not useful when there are censored observations of the auxiliary variable. Moreover, they can be much too sensitive to outlying observations. In these cases a sampling design proportionate to an order statistic of an auxiliary variable can be more useful, which is why such an unequal probability sampling design is proposed here. Its particular cases as well as its conditional version are also considered. A sampling scheme implementing this sampling design is proposed. The inclusion probabilities of the first and second orders are evaluated. The well-known Horvitz–Thompson estimator is taken into account. A ratio estimator dependent on an order statistic is constructed. It is similar to the well-known ratio estimator based on the population and sample means. Moreover, it is an unbiased estimator of the population mean when the sample is drawn according to the proposed sampling design dependent on the appropriate order statistic.

20.

Unbiased Estimation of the Distribution Function of an Exponential Population Using Order Statistics with Application in Ranked Set Sampling

Bikas K. Sinha Sujay Mukhuti 《Communications in Statistics - Theory and Methods》2013,42(9):1655-1670

In this paper we consider the problem of unbiased estimation of the distribution function of an exponential population using order statistics based on a random sample. We present a (unique) unbiased estimator based on a single, say ith, order statistic and study some properties of the estimator for i = 2. We also indicate how this estimator can be used to obtain unbiased estimators when a few selected order statistics are available, as well as when the sample is selected following an alternative sampling procedure known as ranked set sampling. It is further proved that, for a ranked set sample of size two, the proposed estimator is uniformly better than the conventional nonparametric unbiased estimator. For a general sample size, a modified ranked set sampling procedure provides an unbiased estimator that is uniformly better than the conventional nonparametric unbiased estimator based on the usual ranked set sampling procedure.