Similar Articles (20 results)
1.

We consider the problem of allocating a sample in two- and three-stage sampling. We seek an allocation that is efficient both within domains and for the population as a whole. Choudhry et al. (Survey Methodology 38(1):23–29, 2012) recently considered such a problem for one-stage stratified simple random sampling without replacement in domains. Their approach was to minimize the sample size under constraints on the relative variances in all domains and on the overall relative variance, using nonlinear programming. Here, alternatively, we minimize the relative variances in all domains (controlling them through given priority weights) as well as the overall relative variance, under constraints imposed on the total (expected) cost. We consider several two- and three-stage sampling schemes. Our aim is to shed some light on the analytic structure of the solutions rather than to derive a purely numerical tool for sample allocation. To this end, we develop the eigenproblem methodology introduced for optimal allocation problems in Niemiro and Wesołowski (Appl Math 28:73–82, 2001) and recently updated in Wesołowski and Wieczorkowski (Commun Stat Theory Methods 46(5):2212–2231, 2017), extending it to several new sampling schemes and, more importantly, to a single constraint on the total expected variable cost. This approach yields solutions that directly generalize Neyman-type allocation. The structure of the solution is deciphered from explicit allocation formulas given in terms of an eigenvector \({\underline{v}}^*\) of a population-based matrix \(\mathbf{D}\). The solution we provide can be viewed as a multi-domain version of Neyman-type allocation in multistage stratified SRSWOR schemes.

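For orientation, the classical single-characteristic Neyman-type allocation that these multi-domain solutions generalize minimizes the variance of the stratified mean under a linear cost constraint \(\sum_h c_h n_h = C\) (standard textbook notation, not the paper's):

$$n_h \;=\; C\,\frac{W_h S_h/\sqrt{c_h}}{\sum_{k} W_k S_k \sqrt{c_k}},$$

where \(W_h\) is the stratum weight, \(S_h\) the stratum standard deviation, and \(c_h\) the per-unit cost in stratum \(h\).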

2.
Numerous optimization problems arise in survey design. The problem of obtaining an optimal (or near-optimal) sampling design can be formulated and solved as a mathematical programming problem. In multivariate stratified sample surveys it is usually not possible, for one reason or another, to use the individually optimum allocations of sample sizes to the various strata. In such situations some criterion is needed to work out an allocation that is optimum for all characteristics in some sense. Such an allocation may be called an optimum compromise allocation. This paper examines the problem of determining an optimum compromise allocation in multivariate stratified random sampling, when the population means of several characteristics are to be estimated. Formulating the allocation problem as an all-integer nonlinear programming problem, the paper develops a solution procedure using a dynamic programming technique. The compromise allocation discussed is optimal in the sense that it minimizes a weighted sum of the sampling variances of the estimates of the population means of the various characteristics under study. A numerical example illustrates the solution procedure and shows how it compares with Cochran's average allocation and proportional allocation.
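A weighted-variance compromise objective of this kind is separable across strata, so an integer allocation can be found by a textbook dynamic program over strata. The sketch below is a minimal illustration under assumed notation (A[h] aggregating the weighted variance contributions of stratum h), not the paper's algorithm:

```python
def compromise_allocation(A, n_total, n_min=2):
    """Integer allocation minimizing sum_h A[h]/n_h subject to sum_h n_h = n_total.

    A[h] is assumed to aggregate the weighted variance contributions of
    stratum h, e.g. A[h] = sum_j a_j * W_h**2 * S_hj**2 over characteristics j.
    """
    H = len(A)
    INF = float("inf")
    # best[h][b]: minimal objective over strata h..H-1 given remaining budget b
    best = [[INF] * (n_total + 1) for _ in range(H + 1)]
    choice = [[0] * (n_total + 1) for _ in range(H)]
    best[H][0] = 0.0  # the whole budget must be spent
    for h in range(H - 1, -1, -1):
        for b in range(n_total + 1):
            for nh in range(n_min, b + 1):
                val = A[h] / nh + best[h + 1][b - nh]
                if val < best[h][b]:
                    best[h][b] = val
                    choice[h][b] = nh
    alloc, b = [], n_total
    for h in range(H):  # back-track the optimal allocation
        alloc.append(choice[h][b])
        b -= choice[h][b]
    return alloc

# Three strata, total sample of 100 units:
print(compromise_allocation([4.0, 1.0, 9.0], 100))  # roughly proportional to sqrt(A)
```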

3.
We apply geometric programming, developed by Duffin, Peterson and Zener (1967), to the optimal allocation of stratified samples. As an introduction, we show how geometric programming is used to allocate samples according to Neyman (1934), using the data of Cornell (1947) and following the exposition of Cochran (1953).

Then we use geometric programming to allocate an integrated sample introduced by Schwartz (1978) for more efficient sampling of three U.S. Federal welfare quality control systems: Aid to Families with Dependent Children, Food Stamps, and Medicaid.

We develop methods for setting up the allocation problem, interpreting it as a geometric programming primal problem, transforming it to the corresponding dual problem, solving that, and recovering the sample sizes required in the allocation problem. We show that the integrated sample saves sampling costs.
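In generic notation (not the paper's), the Neyman allocation problem is already a geometric program: both the cost objective and the variance constraint are posynomials in the stratum sample sizes,

$$\min_{n_1,\dots,n_H}\; \sum_{h=1}^{H} c_h n_h \quad\text{subject to}\quad \frac{1}{V}\sum_{h=1}^{H} W_h^2 S_h^2\, n_h^{-1} \le 1,\qquad n_h > 0,$$

and the corresponding GP dual has a closed-form maximizer that recovers the familiar allocation \(n_h \propto W_h S_h/\sqrt{c_h}\).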

4.
We apply geometric programming, developed by Duffin, Peterson and Zener (1967), to the optimal allocation of stratified samples with several variance constraints arising from several estimates of deficiency rates in the quality control of administrative decisions. We also develop a method for imposing constraints on sample sizes to equalize workloads over time, as required by the practicalities of clerical work for quality control.

We allocate samples by an extension of the work of Neyman (1934), following the exposition of Cochran (1977). Davis and Schwartz (1987) developed methods for multiconstraint Neyman allocation by geometric programming for integrated sampling. They also applied geometric programming to the Neyman allocation of a sample for estimating college enrollments, using the data of Cornell (1947) and Cochran (1977). This paper continues the application of geometric programming to Neyman allocation with multiple constraints on variances and workloads and with minimal sampling costs.

5.
In stratified sample surveys, the problem of determining the optimum allocation is well known through the articles published by Tschuprow in 1923 and Neyman in 1934. These articles give the optimum sample sizes to be selected from each stratum, for which the sampling variance of the estimator is minimum for a fixed total cost of the survey, or the cost is minimum for a fixed precision of the estimator. If more than one characteristic is to be measured on each selected unit of the sample, that is, if the survey is a multi-response survey, then the problem of determining the optimum sample sizes for the various strata becomes more complex because no single optimality criterion suits all the characteristics. Many authors have discussed compromise criteria that provide a compromise allocation, which is optimum for all characteristics at least in some sense. Almost all of these authors worked out the compromise allocation by minimizing some function of the sampling variances of the estimators under a single cost constraint. A serious objection to this approach is that the variances are not unit-free, so minimizing a function of the variances may not be an appropriate objective for obtaining a compromise allocation. This fact suggests using coefficients of variation instead of variances. In the present article, the problem of compromise allocation is formulated as a multi-objective nonlinear programming problem. By linearizing the nonlinear objective functions at their individual optima, the problem is approximated by an integer linear programming problem. A goal programming technique is then used to obtain a solution to the approximated problem.
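The unit-free objective in question replaces each sampling variance by a squared coefficient of variation; in generic notation, for characteristic \(j\) with estimator \(\hat{\bar{Y}}_j\),

$$CV_j^2 \;=\; \frac{V(\hat{\bar{Y}}_j)}{\bar{Y}_j^2},$$

so the vector \((CV_1^2,\dots,CV_p^2)\) can be minimized as a multi-objective program regardless of the measurement units of the \(p\) characteristics.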

6.
Computation of normalizing constants is a fundamental mathematical problem in various disciplines, particularly in Bayesian model selection. A sampling-based technique known as bridge sampling (Meng and Wong in Stat Sin 6(4):831–860, 1996) has been found to produce accurate estimates of normalizing constants and is known to possess good asymptotic properties. For small to moderate sample sizes (as in situations with limited computational resources), we demonstrate that the (optimal) bridge sampler produces biased estimates. Specifically, when one density (denoted $$p_2$$) is constructed to be close to the target density (denoted $$p_1$$) using the method of moments, our simulation-based results indicate that the correlation-induced bias from the moment-matching procedure is non-negligible. More crucially, the bias amplifies as the dimensionality of the problem increases. Thus, a series of theoretical and empirical investigations is carried out to identify the nature and origin of the bias. We then examine the effect of sample size allocation on the accuracy of bridge sampling estimates and find that one way to reduce both the bias and the standard error, with a small increase in computational effort, is to draw extra samples from the moment-matched density $$p_2$$ (which we assume is easy to sample from), provided that the evaluation of $$p_1$$ is not too expensive. We proceed to show how a simple adaptive approach we term “splitting” alleviates the correlation-induced bias at the expense of a higher standard error, irrespective of the dimensionality involved. We also slightly modify the strategy suggested by Wang et al. (Warp bridge sampling: the next generation, Preprint, 2019. arXiv:1609.07690) to address the increase in standard error due to splitting, and later generalize it to further improve efficiency. We conclude by offering insights, based on the preceding investigations, into how a combination of these adaptive methods improves the accuracy of bridge sampling estimates in Bayesian applications (where posterior samples are typically expensive to generate), with an application to a practical example.
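As background, the (optimal) bridge sampler referred to here is the Meng–Wong fixed-point estimator of the ratio of normalizing constants $$r = c_1/c_2$$. A minimal sketch follows, with variable names of our choosing; production code would work in log space throughout for numerical stability:

```python
import numpy as np

def bridge_estimate(log_l1, log_l2, tol=1e-10, max_iter=1000):
    """Meng & Wong (1996) optimal-bridge fixed-point estimate of r = c1/c2.

    log_l1: log q1(x)/q2(x) evaluated at n1 draws x ~ p1 (q's unnormalized)
    log_l2: log q1(x)/q2(x) evaluated at n2 draws x ~ p2
    """
    n1, n2 = len(log_l1), len(log_l2)
    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
    r = 1.0
    for _ in range(max_iter):
        # numerator averages over draws from p2, denominator over draws from p1
        num = np.mean(np.exp(log_l2) / (s1 * np.exp(log_l2) + s2 * r))
        den = np.mean(1.0 / (s1 * np.exp(log_l1) + s2 * r))
        r_new = num / den
        if abs(r_new - r) < tol * max(1.0, r):
            return r_new
        r = r_new
    return r
```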

7.
In the present work, we formulate a two-treatment, single-period, two-stage adaptive allocation design that assigns a larger allocation proportion to the better treatment arm in the course of the trial while increasing the precision of the parameter estimator. We examine some properties of the proposed rule and compare it with some existing allocation rules, and we report a substantial gain in efficiency with a considerably larger number of allocations to the better treatment, even for moderate sample sizes.

8.
Sampling cost is a crucial factor in sample size planning, particularly when the treatment group is more expensive to sample than the control group. We consider the distribution-free Wilcoxon–Mann–Whitney test for two independent samples and the van Elteren test for a randomized block design, and develop approximate sample size formulas for when the distribution of the data is non-normal and/or unknown. This study derives the optimal sample size allocation ratio for a given statistical power under cost constraints, so that the resulting sample sizes minimize either the total cost or the total sample size. Moreover, for a given total cost, the optimal sample size allocation that maximizes the statistical power of the test is recommended. The proposed formulas are not only novel, but also quick and easy to use. We also apply real data from a clinical trial to illustrate how to choose the sample size for a randomized two-block design. No existing commercial software for sample size planning considers the cost factor for nonparametric methods, so the proposed methods can provide important insights into the impact of cost constraints.
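For orientation, the classical parametric analogue of such cost-optimal allocation (for comparing two means with per-unit costs \(c_1, c_2\) and standard deviations \(\sigma_1, \sigma_2\)) fixes the ratio

$$\frac{n_1}{n_2} \;=\; \frac{\sigma_1}{\sigma_2}\sqrt{\frac{c_2}{c_1}},$$

which reduces to the square-root rule \(n_1/n_2 = \sqrt{c_2/c_1}\) under equal variances; the paper derives the corresponding ratios for the rank-based tests.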

9.
Construction of a confidence interval for the process capability index \(C_{pm}\) is often based on a normal approximation with a fixed sample size. In this article, we describe a different approach for constructing a fixed-width confidence interval for \(C_{pm}\) with a preassigned accuracy, using a combination of bootstrap and sequential sampling schemes. The optimal sample size required to achieve a preassigned confidence level is obtained using both two-stage and modified two-stage sequential procedures. The procedure developed is also validated in an extensive simulation study.
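For reference, the Taguchi-type index in question is conventionally defined, for specification limits \(LSL < USL\) and target \(T\), as

$$C_{pm} \;=\; \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}},$$

so that departures of the process mean \(\mu\) from target are penalized alongside the process variance \(\sigma^2\).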

10.
In this article, we propose a new mixed chain sampling plan based on the process capability index \(C_{pk}\), where the quality characteristic of interest has double specification limits and follows a normal distribution with unknown mean and variance. In the proposed mixed plan, a chain sampling inspection plan is used for the inspection of the attribute quality characteristic. The advantages of this proposed mixed sampling plan are also discussed. Tables are constructed to determine the optimal parameters for practical applications by formulating the problem as a nonlinear program in which the objective function to be minimized is the average sample number and the constraints relate to the lot acceptance probabilities at the acceptable quality level and the limiting quality level on the operating characteristic curve. The practical application of the proposed mixed sampling plan is explained with an illustrative example, and the proposed plan is compared with other existing sampling plans.
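For reference, with specification limits \(LSL < USL\), the index used here is conventionally

$$C_{pk} \;=\; \min\!\left\{\frac{USL-\mu}{3\sigma},\; \frac{\mu-LSL}{3\sigma}\right\},$$

estimated in practice by substituting the sample mean and standard deviation for the unknown \(\mu\) and \(\sigma\).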

11.
This paper considers the effects of informative two-stage cluster sampling on estimation and prediction. Its aims are twofold: first, to estimate the parameters of the superpopulation model for two-stage cluster sampling from a finite population, when the sampling design at both stages is informative, using maximum likelihood methods based on the sample-likelihood function; second, to predict the finite population total and to predict the cluster-specific effects and the cluster totals, both for clusters in the sample and for clusters not in the sample. To achieve this, we derive the sample and sample-complement distributions and the moments of the first- and second-stage measurements. We also derive the conditional sample and conditional sample-complement distributions and the moments of the cluster-specific effects given the cluster measurements. It should be noted that classical design-based inference, which weights the sample observations by the inverses of the sample selection probabilities, cannot be applied to the prediction of the cluster-specific effects for clusters not in the sample. We also give an alternative justification of the Royall [1976. The linear least squares prediction approach to two-stage sampling. Journal of the American Statistical Association 71, 657–664] predictor of the finite population total under a two-stage cluster population. Furthermore, small-area models are studied under informative sampling.
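The sample distribution underlying this kind of approach (in generic notation, not necessarily the authors') relates the density of an observation, conditional on its selection, to the population density via Bayes' rule:

$$f_s(y_i \mid x_i) \;=\; \frac{\Pr(i \in s \mid y_i, x_i)\, f_p(y_i \mid x_i)}{\Pr(i \in s \mid x_i)},$$

so that when the selection probabilities depend on \(y_i\) even after conditioning on \(x_i\) (informative sampling), the sample and population distributions differ.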

12.
In this paper, we propose a sampling design termed multiple-start balanced modified systematic sampling (MBMSS), which involves combining two or more balanced modified systematic samples, thus permitting an unbiased estimate of the associated sampling variance. There are five cases for this design, and in the presence of a linear trend only one of these cases is optimal. To improve the results for the other cases, we propose an estimator that removes linear trend by applying weights to the first and last sampling units of the selected balanced modified systematic samples, termed the MBMSS with end corrections (MBMSSEC) estimator. Under a linear trend model averaged over a super-population model, we compare the expected mean square errors (MSEs) of the proposed sample means with those of simple random sampling (SRS), linear systematic sampling (LSS), stratified random sampling (STR), multiple-start linear systematic sampling (MLSS), and other modified MLSS estimators. As a result, MBMSS is optimal for one of the five possible cases, while the MBMSSEC estimator is preferred for three of the other four cases.

13.
Previous work has been carried out on the use of double sampling schemes for inference from binomial data that are subject to misclassification. The double sampling scheme utilizes a sample of n units which are classified by both a fallible and a true device, and another sample of n2 units which are classified only by the fallible device. A triple sampling scheme incorporates an additional sample of n1 units which are classified only by the true device. In this paper we apply this triple sampling to estimation from binomial data. First, estimation of a binomial proportion is discussed under different misclassification structures. Then the problem of optimal allocation of the sample sizes is discussed.
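One common misclassification structure of this kind (our notation) links the probability \(\pi^*\) of a fallible positive classification to the true proportion \(\pi\) through the sensitivity \(\theta\) and specificity \(\phi\) of the fallible device:

$$\pi^* \;=\; \pi\theta + (1-\pi)(1-\phi),$$

and the doubly and singly classified subsamples together allow \(\pi\), \(\theta\), and \(\phi\) to be estimated jointly.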

14.
In stratified sampling, methods for the allocation of effort among strata usually rely on some measure of within-stratum variance. If we do not have enough information about these variances, adaptive allocation can be used. In adaptive allocation designs, surveys are conducted in two phases, and information from the first phase is used to allocate the remaining units among the strata in the second phase. Brown et al. [Adaptive two-stage sequential sampling, Popul. Ecol. 50 (2008), pp. 239–245] introduced an adaptive allocation sampling design, in which the final sample size is random, together with an unbiased estimator. Here, we derive an unbiased variance estimator for that design, and consider a related design in which the final sample size is fixed; a fixed final sample size can make survey planning easier. We introduce a biased Horvitz–Thompson type estimator and a biased sample-mean type estimator for these sampling designs. We conduct two simulation studies, on honey producers in Kurdistan and on synthetic zirconium distribution in a region of the moon. The results show that the introduced estimators are more efficient than the available estimators for both the variable and the fixed sample size designs, as well as the conventional unbiased estimator under stratified simple random sampling. To evaluate the introduced designs and estimators further, we review some well-known adaptive allocation designs and compare their estimators with the introduced ones; the simulation results again show the introduced estimators to be more efficient.
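The Horvitz–Thompson estimator referred to here weights each sampled unit by its inverse inclusion probability; for a sample \(s\) with inclusion probabilities \(\pi_i\),

$$\hat{\tau}_{HT} \;=\; \sum_{i \in s} \frac{y_i}{\pi_i},$$

which is design-unbiased for the population total when all \(\pi_i\) are known and positive. Under adaptive allocation the realized inclusion probabilities are generally intractable, which is presumably why the "HT-type" estimators above are only approximately of this form and are biased.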

15.
Hypothesis Testing in Two-Stage Cluster Sampling
Correlated observations often arise in complex sampling schemes such as two-stage cluster sampling. The observations resulting from this sampling scheme usually exhibit positive intracluster correlation, as a result of which the standard statistical procedures for testing hypotheses about linear combinations of the parameters may lack some of the optimal properties they possess when the data are uncorrelated. The aim of this paper is to present exact methods for testing these hypotheses by combining within- and between-cluster information, much as in Zhou & Mathew (1993).
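The intracluster correlation in question is conventionally expressed through a random-effects model (standard notation): for unit \(j\) in cluster \(i\),

$$y_{ij} = \mu + b_i + e_{ij}, \qquad b_i \sim (0, \sigma_b^2),\quad e_{ij} \sim (0, \sigma_e^2),$$

giving correlation \(\rho = \sigma_b^2/(\sigma_b^2 + \sigma_e^2) > 0\) between any two units in the same cluster.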

16.
In sample surveys, post-stratification is often used when strata cannot be identified in advance of the survey. If the sample size is large, post-stratification is usually about as effective as ordinary stratification with proportional allocation. In the case of small samples, however, no generally accepted theory or technique has been developed; one of the main difficulties is the possibility of obtaining zero sample sizes in some strata. In this paper, we overcome this difficulty by employing a sampling scheme referred to as multiple inverse sampling, which ensures that a specified number of observations is sampled from each stratum. A Monte Carlo simulation is carried out to compare the estimator obtained under multiple inverse sampling with some existing estimators. The estimator under multiple inverse sampling is superior in the sense that it is unbiased and its variance does not depend on the values of the stratum means in the population.

17.
Prediction of random effects is an important problem with expanding applications. In the simplest context, the problem corresponds to predicting the latent value (the mean) of a realized cluster selected via two-stage sampling. Recently, Stanek and Singer [Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 119–130] developed best linear unbiased predictors (BLUPs) under a finite population mixed model that outperform BLUPs from mixed models and superpopulation models. Their setup, however, does not allow for unequally sized clusters. To overcome this drawback, we consider an expanded finite population mixed model based on a larger set of random variables that span a higher-dimensional space than those typically applied to such problems. We show that BLUPs for linear combinations of the realized cluster means derived under such a model have considerably smaller mean squared error (MSE) than those obtained from mixed models, superpopulation models, and finite population mixed models. We motivate our general approach by an example developed for two-stage cluster sampling and show that it faithfully captures the stochastic aspects of sampling in the problem. We also present simulation studies to illustrate the increased accuracy of the BLUP obtained under the expanded finite population mixed model.

18.
Under stratified random sampling, we develop a kth-order bootstrap bias-corrected estimator of the number of classes θ that exist in a study region. This research extends Smith and van Belle's (1984) first-order bootstrap bias-corrected estimator under simple random sampling. Our estimator is applicable in many settings, including estimating the number of animals when there are stratified capture periods, estimating the number of species based on stratified random sampling of subunits (say, quadrats) from the region, and estimating the number of errors/defects in a product based on observations from two or more types of inspectors. When the differences between the strata are large, utilizing stratified random sampling and our estimator often results in superior performance compared with simple random sampling and its bootstrap or jackknife [Burnham and Overton (1978)] estimator. The superior performance is often associated with more observed classes, and we provide insights into the optimal designation of strata and the optimal allocation of sample sectors to strata.
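The first-order correction being extended has the generic form (standard bootstrap notation): with \(\hat{\theta}\) the observed number of classes and \(E_*(\hat{\theta}^*)\) its bootstrap expectation,

$$\hat{\theta}_{BC} \;=\; \hat{\theta} - \widehat{\text{bias}} \;=\; 2\hat{\theta} - E_*(\hat{\theta}^*),$$

and a kth-order estimator iterates this bias correction k times.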

19.
Samples of size n are drawn from a finite population on each of two occasions. On the first occasion a variate x is measured, and on the second a variate y. In estimating the population mean of y, the variance of the best linear unbiased combination of the means of the matched and unmatched samples is itself minimized, with respect to the sampling design on the second occasion, by a certain degree of matching. This optimal allocation depends on the population correlation coefficient, which previous authors have assumed known. We estimate the correlation from an initial matched sample; an approximately optimal allocation is then completed and an estimator formed which, under a bivariate normal superpopulation model, has model expected mean square error equal, apart from an error of order \(n^{-2}\), to the minimum enjoyed by any linear unbiased estimator.
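For reference, in the classical known-\(\rho\) solution (Cochran's textbook treatment, our notation) the optimum fraction of the second-occasion sample to match is

$$\frac{m}{n} \;=\; \frac{\sqrt{1-\rho^2}}{1+\sqrt{1-\rho^2}},$$

attaining minimum variance \(S^2\bigl(1+\sqrt{1-\rho^2}\bigr)/(2n)\) for the current mean; the paper's contribution is to dispense with the assumption that \(\rho\) is known.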

20.
The problem of optimal non-sequential allocation of observations for selecting the better of two binomial populations is considered in the case of fixed sampling costs and a fixed budget. With an appropriate choice of selection rule, it is shown that a 70% reduction in the probability of incorrect selection is possible by using an unequal rather than an equal allocation. Simple formulae are given for the appropriate selection rule and unequal allocation in large samples.
