Similar literature
 Found 20 similar articles (search time: 359 ms)
1.
Large governmental surveys typically provide accurate national statistics. To decrease the mean squared error of estimates for small areas, i.e., domains in which the sample size is small, auxiliary variables from administrative records are often used as covariates in a mixed linear model. It is generally assumed that the auxiliary information is available for every small area. In many cases, though, such information is available for only some of the small areas, either from another survey or from a previous administration of the same survey. The authors propose and study small area estimators that use multivariate models to combine information from several surveys. They discuss computational algorithms, and a simulation study indicates that if quantities in the different surveys are sufficiently correlated, substantial gains in efficiency can be achieved.

2.
Generalised variance function (GVF) models are data analysis techniques often used in large-scale sample surveys to approximate the design variance of point estimators for population means and proportions. Potential advantages of the GVF approach include operational simplicity, more stable sampling error estimates, and a convenient way of summarising results when a large number of survey variables is considered. In this paper, several parametric and nonparametric methods for GVF estimation with binary variables are proposed and compared. The behaviour of these estimators is analysed under heteroscedasticity and in the presence of outliers and influential observations. An empirical study based on the annual survey of living conditions in Galicia (a region in the northwest of Spain) illustrates the behaviour of the proposed estimators.

3.
In this paper, a new small domain estimator for area-level data is proposed. The proposed estimator is motivated by a real problem: estimating the mean price of housing transactions at a regional level in a European country, using data collected from a longitudinal survey conducted by a national statistical office. At the desired level of inference, it is not possible to provide accurate direct estimates because the sample sizes in these domains are very small. The proposed combined estimator is assisted by an area-level model with a heterogeneous covariance structure of random effects. This model is an extension of a model due to Fay and Herriot [5], but it integrates information across domains and over several periods of time. In addition, a modified method of estimating variance components for time-series and cross-sectional area-level models is proposed that includes the design weights. A Monte Carlo simulation, based on real data, is conducted to investigate the performance of the proposed estimators in comparison with other estimators frequently used in small area estimation problems. In particular, we compare the performance of these estimators with the estimator based on the Rao–Yu model [23]. The simulation study also assesses the performance of the modified variance component estimators in comparison with the traditional ANOVA method. Simulation results show that the proposed estimators perform better than the other estimators in terms of both precision and bias.

4.
This paper analyzes the properties of estimators of the population mean arising from the ratio and product methods of estimation in sample surveys when the observations on both the study and auxiliary variables are contaminated with measurement errors. The measurement errors in the two variables are also correlated. The properties of the ratio and product estimators, along with the sample mean, are derived and studied under the influence of measurement errors. The properties of the estimators in finite samples are studied through Monte Carlo simulation, and the findings are reported.
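
For orientation, the textbook ratio and product estimators that the abstract above refers to can be sketched as follows. This is a minimal illustration that assumes error-free observations and a known population mean of the auxiliary variable; it does not reproduce the measurement-error analysis of the paper. The function names and the sample values are hypothetical.

```python
from statistics import mean


def ratio_estimate(y, x, x_pop_mean):
    """Ratio estimator of the population mean of y: ybar * (Xbar / xbar)."""
    return mean(y) * x_pop_mean / mean(x)


def product_estimate(y, x, x_pop_mean):
    """Product estimator of the population mean of y: ybar * (xbar / Xbar)."""
    return mean(y) * mean(x) / x_pop_mean


# Hypothetical sample values and a hypothetical known population mean of x.
y_sample = [2.0, 4.0, 6.0]
x_sample = [1.0, 2.0, 3.0]
X_BAR = 4.0

print(ratio_estimate(y_sample, x_sample, X_BAR))    # ybar=4, xbar=2 -> 8.0
print(product_estimate(y_sample, x_sample, X_BAR))  # ybar=4, xbar=2 -> 2.0
```

The ratio form is usually preferred when y and x are positively correlated, and the product form when they are negatively correlated.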

5.
We consider nonresponse-adjusted poststratified estimation when there exists a set of clear response homogeneity groups but the population distribution of that set is unknown, which is common in practice. We propose a partially calibrated poststratified estimator that is asymptotically unbiased and satisfies a calibration equation for the auxiliary variables whose joint population distribution is known. We also provide a variance estimator for the proposed poststratified estimator. In a small simulation study, the proposed estimator performed better than or comparably to commonly used estimators.

6.
Calibration techniques in survey sampling, such as generalized regression estimation (GREG), were formalized in the 1990s to produce efficient estimators of linear combinations of study variables, such as totals or means. They implicitly rely on the assumption of a linear regression model between the variable of interest and some auxiliary variables, yielding estimates with lower variance if the model is true while remaining approximately design-unbiased even if the model does not hold. We propose a new class of model-assisted estimators obtained by relaxing a few calibration constraints and replacing them with a penalty term. This penalization is added to the distance criterion to be minimized. By introducing the concept of penalized calibration, which combines usual calibration and this ‘relaxed’ calibration, we are able to adjust the weight given to the available auxiliary information. We obtain a more flexible estimation procedure that gives better estimates, particularly when the auxiliary information is overly abundant or not fully appropriate to be used in its entirety. Such an approach can also be seen as a design-based alternative to estimation procedures based on the more general class of mixed models, opening new prospects in some areas of application such as inference on small domains.

7.
We present some unbiased estimators of the population mean in finite population sample surveys with a simple random sampling design, where information on an auxiliary variate x positively correlated with the main variate y is available. The exact variance and an unbiased estimate of the variance are computed for any sample size. These estimators are compared for their precision with the mean per unit and the ratio estimators. Modifications of the estimators are suggested to make them more precise than the mean per unit estimator or the ratio estimator regardless of the value of the population correlation coefficient between the variates x and y. The asymptotic distribution of our estimators and confidence intervals for the population mean are also obtained.

8.
Much of the small-area estimation literature focuses on population totals and means. However, users of survey data are often interested in the finite-population distribution of a survey variable and in the measures (e.g. medians, quartiles, percentiles) that characterize the shape of this distribution at the small-area level. In this paper we propose a model-based direct estimator (MBDE, Chandra and Chambers) of the small-area distribution function. The MBDE is defined as a weighted sum of sample data from the area of interest, with weights derived from the calibrated spline-based estimate of the finite-population distribution function introduced by Harms and Duchesne, under an appropriately specified regression model with random area effects. We also discuss the mean squared error estimation of the MBDE. Monte Carlo simulations based on both simulated and real data sets show that the proposed MBDE and its associated mean squared error estimator perform well when compared with alternative estimators of the area-specific finite-population distribution function.

9.
Estimation of price indexes in the United States is generally based on complex rotating panel surveys. The sample for the Consumer Price Index, for example, is selected in three stages (geographic areas, establishments, and individual items), with 20% of the sample being replaced by rotation each year. At each period, a time series of data is available for use in estimation. This article examines how best to combine data for estimation of long-term and short-term changes and how to estimate the variances of the index estimators in the context of two-stage sampling. I extend the class of estimators, introduced by Valliant and Miller, of Laspeyres indexes formed using sample data collected from the current period back to a previous base period. Linearization estimators of variance for indexes of long-term and short-term change are derived. The theory is supported by an empirical simulation study using two-stage sampling of establishments and items from a population derived from U.S. Bureau of Labor Statistics data.

10.
Small area estimation techniques are becoming increasingly used in survey applications to provide estimates for local areas of interest. The objective of this article is to develop and apply Information Theoretic (IT)-based formulations to estimate small area business and trade statistics. More specifically, we propose a Generalized Maximum Entropy (GME) approach to the problem of small area estimation that exploits auxiliary information relating to other known variables on the population and adjusts for consistency and additivity. The GME formulations, combining information from the sample together with out-of-sample aggregates of the population of interest, can be particularly useful in the context of small area estimation, for both direct and model-based estimators, since they do not require strong distributional assumptions on the disturbances. The performance of the proposed IT formulations is illustrated through real and simulated datasets.

11.
Adaptive cluster sampling is an efficient method of estimating the parameters of rare and clustered populations. The method mimics how biologists would like to collect data in the field by targeting survey effort to localised areas where the rare population occurs. Another popular sampling design is inverse sampling. Inverse sampling was developed so as to be able to obtain a sample of rare events having a predetermined size. Ideally, in inverse sampling, the resultant sample set will be sufficiently large to ensure reliable estimation of population parameters. In an effort to combine the good properties of these two designs, adaptive cluster sampling and inverse sampling, we introduce inverse adaptive cluster sampling with unequal selection probabilities. We develop an unbiased estimator of the population total that is applicable to data obtained from such designs. We also develop numerical approximations to this estimator. The efficiency of the estimators that we introduce is investigated through simulation studies based on two real populations: crabs in Al Khor, Qatar and arsenic pollution in Kurdistan, Iran. The simulation results show that our estimators are efficient.

12.
This article considers the case where two surveys collect data on a common variable, with one survey being much smaller than the other. The smaller survey collects data on an additional variable of interest, related to the common variable collected in the two surveys, and out-of-scope with respect to the larger survey. Estimation of the two related variables is of interest at domains defined at a granular level. We propose a multilevel model for integrating data from the two surveys, by reconciling survey estimates available for the common variable, accounting for the relationship between the two variables, and expanding estimation for the other variable, for all the domains of interest. The model is specified as a hierarchical Bayes model for domain-level survey data, and posterior distributions are constructed for the two variables of interest. A synthetic estimation approach is considered as an alternative to the hierarchical modelling approach. The methodology is applied to wage and benefits estimation using data from the National Compensation Survey and the Occupational Employment Statistics Survey, available from the Bureau of Labor Statistics, Department of Labor, United States.

13.
This paper compares minimum distance estimation with best linear unbiased estimation to determine which technique provides the most accurate estimates of the location and scale parameters of the three-parameter Pareto distribution. Two minimum distance estimators are developed for each of the three distance measures used (Kolmogorov, Cramer-von Mises, and Anderson-Darling), resulting in six new estimators. For a given sample size of 6 or 18 and shape parameter 1(1)4, the location and scale parameters are estimated. A Monte Carlo technique is used to generate the sample sets. The best linear unbiased estimator and the six minimum distance estimators provide parameter estimates based on each sample set. These estimates are compared using mean square error as the evaluation tool. Results show that the best linear unbiased estimator provided more accurate estimates of location and scale than did the minimum distance estimators tested.

14.
Sample surveys are usually designed and analysed to produce estimates for larger areas. Nevertheless, sample sizes are often not large enough to give adequate precision for small area estimates of interest. To overcome such difficulties, borrowing strength from related small areas via modelling becomes essential. In line with this, we propose components of variance models with power transformations for small area estimation. This paper reports the results of a study aimed at incorporating the power transformation in small area estimation for improving the quality of small area predictions. The proposed methods are demonstrated on satellite data in conjunction with survey data to estimate mean acreage under a specified crop for counties in Iowa.

15.
Small area estimation (SAE) is concerned with how to reliably estimate population quantities of interest when some areas or domains have very limited samples. This is an important issue in large population surveys, because geographical areas or groups with only small samples, or even no samples, are often of interest to researchers and policy-makers. For example, large population health surveys, such as the Behavioural Risk Factor Surveillance System and the Ohio Medicaid Assessment Survey (OMAS), are regularly conducted to monitor insurance coverage and healthcare utilization. Classic approaches usually provide accurate estimators at the state level or for large geographical regions, but they fail to provide reliable estimators for many rural counties where the samples are sparse. Moreover, a systematic evaluation of the performance of SAE methods in real-world settings is lacking in the literature. In this paper, we propose a Bayesian hierarchical model with constraints on the parameter space and show that it provides superior estimators for county-level adult uninsured rates in Ohio based on the 2012 OMAS data. Furthermore, we perform extensive simulation studies to compare our methods with a collection of common SAE strategies, including direct estimators, synthetic estimators, composite estimators, and the Bayesian hierarchical model-based estimators of Datta GS, Ghosh M, Steorts R, Maples J. [Bayesian benchmarking with applications to small area estimation. Test 2011;20(3):574–588]. To set a fair basis for comparison, we generate our simulation data with characteristics mimicking the real OMAS data, so that neither model-based nor design-based strategies use the true model specification. The estimators based on our proposed model are shown to outperform other estimators for small areas in both the simulation study and the real data analysis.

16.
Sampling has evolved into a universally accepted approach for gathering information and data mining, as it is widely accepted that a reasonably modest-sized sample can sufficiently characterize a much larger population. In stratified sampling designs, the whole population is divided into homogeneous strata in order to achieve higher precision in the estimation. This paper proposes an efficient method of constructing optimum stratum boundaries (OSB) and determining optimum sample size (OSS) for the survey variable. In practice, the survey variable is unavailable prior to conducting the survey. Thus, the method is based on an auxiliary variable, which is usually readily available from past surveys. In the real-data example used to illustrate the application, the auxiliary variable follows a Weibull distribution. The stratification problem is formulated as a Mathematical Programming Problem (MPP) that seeks to minimize the variance of the estimated population parameter under Neyman allocation. The solution procedure employs the dynamic programming technique, which results in substantial gains in the precision of the estimates of the population characteristics.
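
As a reminder of the allocation rule named in the abstract above, Neyman allocation assigns the total sample size n to stratum h in proportion to N_h * S_h, where N_h is the stratum size and S_h the stratum standard deviation. The sketch below shows only this standard rule for fixed strata; it does not implement the paper's optimum stratum boundaries or its dynamic programming procedure. The function name and the numbers are hypothetical.

```python
def neyman_allocation(n_total, stratum_sizes, stratum_sds):
    """Return (possibly fractional) stratum sample sizes n_h = n * N_h*S_h / sum(N_k*S_k)."""
    products = [N_h * S_h for N_h, S_h in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    return [n_total * p / total for p in products]


# Hypothetical strata: the smaller stratum is more variable, so it is sampled
# at a higher rate than proportional allocation would give.
allocation = neyman_allocation(40, [100, 200], [2.0, 1.0])
print(allocation)  # N_h*S_h = 200 and 200 -> [20.0, 20.0]
```

In practice the fractional n_h are rounded, and capped at N_h when a stratum would otherwise be oversampled.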

17.
Model-based estimators are becoming very popular in statistical offices because governments require accurate estimates for small domains that were not planned when the study was designed, as their inclusion would have increased the cost of the study. The sample sizes in these domains are very small or even zero; consequently, traditional direct design-based estimators lead to unacceptably large standard errors. In this regard, model-based estimators that 'borrow information' from related areas by using auxiliary information are appropriate. This paper reviews, under the model-based approach, a BLUP synthetic and an EBLUP estimator. The goal is to obtain estimators of domain totals when there are several domains with very small sample sizes or without sampled units. We also provide detailed expressions of the mean squared error at different levels of aggregation. The results are illustrated with real data from the Basque Country Business Survey.

18.
An application of empirical Bayes and Kalman filtering techniques is reported, using live data from the Indian Statistical Institute (ISI), Calcutta, to illustrate how initial small domain estimators may be vastly improved upon. A stratified two-stage sampling procedure is adopted, allowing selection of first-stage units with unequal probabilities but of second-stage units with equal probabilities. Standard design-based estimators for domain totals are initialized based on domain-specific survey data alone. Strength is then borrowed across domains and from past surveys. The resulting gains in efficacy are numerically demonstrated through replicated sampling from official records.

19.
Nonparametric estimation of population size is a long-standing and difficult problem. It is difficult because, particularly from a likelihood perspective, the underlying distribution could vary greatly and many small-probability events may not be observed in a sample. However, if the problem is approached from an entropic standpoint, certain trends can be exploited. This article proposes several estimators based on an entropic representation of population size and establishes their consistency. Simulation results for the proposed estimators are also reported in comparison with a well-known estimator, and the advantages are noted. Two examples with real data are also included.

20.
Survey statisticians make use of auxiliary information to improve estimates. One important example is calibration estimation, which constructs new weights that match benchmark constraints on auxiliary variables while remaining “close” to the design weights. Multiple-frame surveys are increasingly used by statistical agencies and private organizations to reduce sampling costs and/or avoid frame undercoverage errors. Several ways of combining estimates derived from such frames have been proposed elsewhere; in this paper, we extend the calibration paradigm, previously used for single-frame surveys, to estimate the total of a variable of interest in a dual-frame survey. Calibration is a general tool that allows the inclusion of auxiliary information from two frames. It also incorporates, as a special case, certain dual-frame estimators that have been proposed previously. The theoretical properties of our class of estimators are derived and discussed, and simulation studies are conducted to compare the efficiency of the procedure using different sets of auxiliary variables. Finally, the proposed methodology is applied to real data obtained from the Barometer of Culture of Andalusia survey.
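
The idea of calibration described in the abstract above (new weights that reproduce a known benchmark while staying close to the design weights) can be sketched for the simplest single-frame case. The example below assumes one auxiliary variable and the chi-square distance sum((w - d)^2 / d), for which the calibrated weights have the closed form w_i = d_i * (1 + lambda * x_i). It is an illustration only and is not the dual-frame estimator of the paper; the function name and the data are hypothetical.

```python
def linear_calibration_weights(design_weights, x, x_total):
    """Chi-square-distance calibration on one auxiliary variable.

    Returns weights w_i = d_i * (1 + lam * x_i) with lam chosen so that
    sum(w_i * x_i) equals the known population total x_total.
    """
    ht_total = sum(d * xi for d, xi in zip(design_weights, x))    # design-weighted total of x
    denom = sum(d * xi * xi for d, xi in zip(design_weights, x))  # sum of d_i * x_i^2
    lam = (x_total - ht_total) / denom
    return [d * (1.0 + lam * xi) for d, xi in zip(design_weights, x)]


# Hypothetical design weights, auxiliary values, and known total of x.
d = [10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0]
w = linear_calibration_weights(d, x, x_total=66.0)

# The calibrated weights reproduce the benchmark up to rounding error.
print(sum(wi * xi for wi, xi in zip(w, x)))
```

Other distance functions (for example raking) lead to different weight adjustments and generally need an iterative solution.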
