In finite population sampling, a distinction is often made between model- and design-based estimators of the parameters of interest (such as the population total or the population variance). The model-based estimators depend on the (known) parameters of the model, while the design-based estimators depend on the (known) selection probabilities of the different units in the population. It is shown in this paper that the two approaches are not necessarily incompatible, and indeed can often lead to the same estimator. Our ideas are illustrated with the Horvitz-Thompson and generalized Horvitz-Thompson estimators. These estimators are identified as hierarchical Bayes estimators. Also, certain “stepwise-Bayes” estimators of Vardeman and Meeden (J. Stat. Inf. (1983), V7, pp 329-341) are unified from a hierarchical Bayes point of view.
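
As an illustration of how a design-based estimator depends only on the selection probabilities, the following is a minimal sketch of the standard Horvitz-Thompson estimator of a population total, which weights each sampled value by the inverse of its inclusion probability. The function name and the sample figures are hypothetical and are not taken from the paper.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of y_i / pi_i over sampled units."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        if not 0.0 < pi <= 1.0:
            raise ValueError("each inclusion probability must lie in (0, 1]")
        total += y / pi  # a unit with a small pi stands in for many unsampled units
    return total


# Hypothetical sample of three units with known inclusion probabilities.
sample_values = [10.0, 20.0, 30.0]
sample_probs = [0.5, 0.25, 0.5]
estimate = horvitz_thompson_total(sample_values, sample_probs)
print(estimate)
```

No model for the response enters the computation; only the design (through the inclusion probabilities) does.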

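In survey sampling, the properties of estimators obtained by model-based inference depend on the model. Under an appropriate model, the ratio estimator and the expansion estimator are best linear unbiased estimators. When the model is misspecified, the ratio and expansion estimators are biased, but if the sample is balanced the bias can be eliminated, which realizes the idea of handling a complex problem in a simple way.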
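
A minimal sketch of the two estimators of a population total, using their textbook forms: the expansion estimator scales the sample mean by the population size, and the ratio estimator scales the known auxiliary total by the sample ratio. The function names and data are hypothetical. The sample below is "balanced" in the simple sense that the sample mean of the auxiliary variable equals its population mean, in which case the two estimators coincide.

```python
def expansion_estimator(y_sample, population_size):
    """Expansion estimator of the total: N times the sample mean of y."""
    return population_size * sum(y_sample) / len(y_sample)


def ratio_estimator(y_sample, x_sample, x_population_total):
    """Ratio estimator of the total: known total of x times sum(y) / sum(x)."""
    return x_population_total * sum(y_sample) / sum(x_sample)


# Hypothetical population of N = 10 units whose auxiliary total is 20,
# so the population mean of x is 2.0.
population_size = 10
x_population_total = 20.0

# Hypothetical sample whose mean of x is also 2.0 (a balanced sample).
x_sample = [1.0, 2.0, 3.0]
y_sample = [2.0, 4.0, 6.0]

expansion = expansion_estimator(y_sample, population_size)
ratio = ratio_estimator(y_sample, x_sample, x_population_total)
print(expansion, ratio)
```

If the sample mean of x differed from the population mean, the two estimates would differ by the factor (population mean of x) / (sample mean of x).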

Model-based estimators are becoming very popular in statistical offices because governments require accurate estimates for small domains that were not planned when the study was designed, as their inclusion would have increased the cost of the study. The sample sizes in these domains are very small or even zero; consequently, traditional direct design-based estimators lead to unacceptably large standard errors. In this regard, model-based estimators that 'borrow information' from related areas by using auxiliary information are appropriate. This paper reviews, under the model-based approach, a BLUP synthetic and an EBLUP estimator. The goal is to obtain estimators of domain totals when there are several domains with very small sample sizes or without sampled units. We also provide detailed expressions of the mean squared error at different levels of aggregation. The results are illustrated with real data from the Basque Country Business Survey.

Suppose that a finite population consists of N distinct units. Associated with the ith unit is a polychotomous response vector, d_i, and a vector of auxiliary variables, x_i. The values x_i are known for the entire population, but the d_i are known only for the units selected in the sample. The problem is to estimate the finite population proportion vector P. One of the fundamental questions in finite population sampling is how to make use of the complete auxiliary information effectively at the estimation stage. In this article a predictive estimator is proposed which incorporates the auxiliary information at the estimation stage by invoking a superpopulation model. However, the use of such estimators is often criticized since the working superpopulation model may not be correct. To protect the predictive estimator from possible model failure, a nonparametric regression model is considered in the superpopulation. The asymptotic properties of the proposed estimator are derived, and a bootstrap-based hybrid re-sampling method for estimating the variance of the proposed estimator is developed. Results of a simulation study are reported on the performance of the predictive estimator and its re-sampling-based variance estimator from the model-based viewpoint. Finally, survey data on the opinions of 686 individuals about the cause of addiction are used in an empirical study to investigate the performance of the nonparametric predictive estimator from the design-based viewpoint.

To estimate model parameters from complex sample data, we apply maximum likelihood techniques to the complex sample data from the finite population, which is treated as a sample from an infinite superpopulation. General asymptotic distribution theory is developed and then applied to both logistic regression and discrete proportional hazards models. Data from the Lipid Research Clinics Program are used to illustrate each model, demonstrating the effects on inference of neglecting the sampling design during parameter estimation. These empirical results also shed light on the issue of model-based vs. design-based inferences.

This paper examines strategies for estimating the mean of a finite population in the following situation: A linear regression model is assumed to describe the population scatter. Various estimators β̂ of the vector of regression parameters β are considered. Several ways of transforming each estimator β̂ into a model-based estimator for the population mean are considered. Some estimators constructed in this way become sensitive to correctness of the assumed model. The estimators favoured in this paper are the ones in which the observations are weighted to reflect the sampling design, so that asymptotic design unbiasedness is achieved. For these estimators, the randomization distribution gives protection against model breakdown.

This article describes a method for developing an efficiently stratified sampling design for a generalized difference estimator, using a superpopulation model. The effectiveness of model-based stratification is compared to several conventional designs, using a database from a complete audit of an inventory of 8,069 items. In this application, model-based designs reduce the required sample size by 9% to 40%, compared to the conventional designs.

A researcher using complex longitudinal survey data for event history analysis has to make several choices that affect the analysis results. These choices include the following: whether a design-based or a model-based approach for the analysis is taken, which subset of data to use and, if a design-based approach is chosen, which weights to use. We discuss different choices and illustrate their effects using longitudinal register data linked at person-level with the Finnish subset of the European Community Household Panel data. The use of register data enables us to construct an event history data set without nonresponse and attrition. Design-based estimates from these data are used as benchmarks against design-based and model-based estimates from subsets of data usually available to a survey data analyst. Our illustration suggests that the often recommended way to use panel data for longitudinal analyses, namely data from total respondents and weights from the last wave analysed, may not be the best way to go. Instead, using all available data and weights from the first survey wave appears to be a safe choice for longitudinal analyses based on multipurpose survey data.

Two classes of methods properly account for clustering of data: design-based methods and model-based methods. Estimates from both methods have been shown to be approximately equal with large samples. However, both classes are known to produce biased standard error estimates with small samples. This paper compares the bias of standard errors and statistical power of marginal effects for generalized estimating equations (a design-based method) and generalized/linear mixed effects models (model-based methods) with small sample sizes via a simulation study. Provided that the distributional assumptions are met, model-based methods produced the least-biased standard error estimates and greater relative statistical power.

We extend the random permutation model to obtain the best linear unbiased estimator of a finite population mean accounting for auxiliary variables under simple random sampling without replacement (SRS) or stratified SRS. The proposed method provides a systematic design-based justification for well-known results involving common estimators derived under minimal assumptions that do not require specification of a functional relationship between the response and the auxiliary variables.

The high level of unemployment is a major problem in most European countries nowadays. Hence, the demand for small area labor market statistics has rapidly increased over the past few years. The Portuguese Labour Force Survey is the main source of official statistics at the macro level. However, it was not designed to produce reliable design-based statistics at the micro level due to small sample sizes. The goal of this article is to analyze the performance of model-based small area estimators in estimating the unemployment rate at the micro level. Our results showed that the temporal estimator is the most suitable.

Sample surveys for estimating the abundance of wildlife ungulate populations are considered in a design-based approach. On the basis of previous theoretical results, a two-stage sampling is proposed. In the first stage, some spatial units are selected using Lahiri-Midzuno sampling, while in the second stage, the animal abundance in the selected units is estimated by means of plot sampling performed on the faecal accumulation within the units. The statistical properties of the resulting ratio estimator of abundance are outlined. An application of the proposed method for estimating fallow-deer and roe-deer abundance in Maremma Regional Park is described.

Kernel density estimation has been used with great success with data that may be assumed to be generated from independent and identically distributed (iid) random variables. The methods and theoretical results for iid data, however, do not directly apply to data from stratified multistage samples. We present finite-sample and asymptotic properties of a modified density estimator introduced in Buskirk (Proceedings of the Survey Research Methods Section, American Statistical Association (1998), pp. 799–801) and Bellhouse and Stafford (Statist. Sin. 9 (1999) 407–424); this estimator incorporates both the sampling weights and the kernel weights. We present regularity conditions which lead the sample estimator to be consistent and asymptotically normal under various modes of inference used with sample survey data. We also introduce a superpopulation structure for model-based inference that allows the population model to reflect naturally occurring clustering. The estimator, and confidence bands derived from the sampling design, are illustrated using data from the US National Crime Victimization Survey and the US National Health and Nutrition Examination Survey.
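
To make the idea of combining sampling weights with kernel weights concrete, here is a minimal sketch of a weighted kernel density estimate in which each observation's kernel contribution is multiplied by its normalized sampling weight. The Gaussian kernel, the function name, and the data are assumptions made for illustration; the exact form of the estimator studied in the cited papers may differ.

```python
import math


def weighted_kernel_density(x, observations, weights, bandwidth):
    """Weighted Gaussian kernel density estimate at the point x.

    Computes sum_i w_i * K((x - x_i) / h) / (h * sum_i w_i),
    where K is the standard normal density.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if len(observations) != len(weights):
        raise ValueError("observations and weights must have the same length")
    weight_sum = sum(weights)
    kernel_sum = 0.0
    for obs, w in zip(observations, weights):
        u = (x - obs) / bandwidth
        kernel_sum += w * math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return kernel_sum / (bandwidth * weight_sum)


# Hypothetical sample values with unequal sampling weights.
observations = [1.0, 2.0, 4.0]
weights = [10.0, 30.0, 60.0]
density_at_two = weighted_kernel_density(2.0, observations, weights, bandwidth=1.0)
print(density_at_two)
```

Because the weights are normalized by their sum, rescaling all sampling weights by a constant leaves the estimate unchanged, and equal weights recover the usual iid kernel density estimate.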

This article reviews four area-level linear mixed models that borrow strength by exploiting the possible correlation among neighboring areas and/or past time periods. Its main goal is to study whether there are efficiency gains when spatial dependence and/or temporal autocorrelation among random-area effects is included in the models. The Fay–Herriot estimator is used as a benchmark. A design-based simulation study based on real data collected from a longitudinal survey conducted by a statistical office is presented. Our results show that models that explore both spatial and chronological association considerably improve the efficiency of small area estimates.

A technique is presented for estimating the size of a closed population from multiple recapture data when sampling is performed without replacement on the last trapping occasion. The estimator of the population size along with the variance estimator is derived from a log-linear model.

In the presence of multicollinearity the literature points to principal component regression (PCR) as an estimation method for the regression coefficients of a multiple regression model. Due to ambiguities in interpretation arising from the orthogonal transformation of the set of explanatory variables, the method has not yet gained wide acceptance. Factor analysis regression (FAR) provides a model-based estimation method which is particularly tailored to overcome multicollinearity in an errors-in-variables setting. In this paper two feasible versions of a FAR estimator are compared with the OLS estimator and the PCR estimator by means of Monte Carlo simulation. While the PCR estimator performs best in cases of strong and high multicollinearity, the Thomson-based FAR estimator proves to be superior when the regressors are moderately correlated.

A Bayesian estimator for the total number of distinct species present in the region of investigation is constructed when the quadrat sampling procedure is used to collect a sample of species. The estimator is based on a model similar to that used by Mingoti and Meeden, and uses, as a special case, the zero-truncated negative binomial distribution as a prior distribution for the true number S of distinct species in the region. Confidence intervals are also obtained. Simple comparisons with the first-order jackknife estimator and the empirical Bayesian estimator are performed.

We consider the method of distance sampling described by Buckland, Anderson, Burnham and Laake in 1993. We explore the properties of the methodology in simple cases chosen to allow direct and accessible comparisons of distance sampling in the design- and model-based frameworks. In particular, we obtain expressions for the bias and variance of the distance sampling estimator of object density and for the expected value of the recommended analytic variance estimator within each framework. These results enable us to clarify aspects of the performance of the methodology which may be of interest to users and potential users of distance sampling.

It is well known in the literature on multicollinearity that one of its major consequences for the ordinary least squares estimator is that the estimator has large sampling variances, which in turn might inappropriately lead to the exclusion of otherwise significant coefficients from the model. To circumvent this problem, two accepted estimation procedures which are often suggested are the restricted least squares method and the ridge regression method. While the former leads to a reduction in the sampling variance of the estimator, the latter ensures a smaller mean square error value for the estimator. In this paper we have proposed a new estimator which is based on a criterion that combines the ideas underlying these two estimators. The standard properties of this new estimator have been studied in the paper. It has also been shown that this estimator is superior to both the restricted least squares and the ordinary ridge regression estimators by the criterion of mean square error of the estimator of the regression coefficients when the restrictions are indeed correct. The conditions for superiority of this estimator over the other two have also been derived for the situation when the restrictions are not correct.
