Similar articles
20 similar articles found (search time: 515 ms)
1.
This paper describes small area estimation (SAE) of proportions under a spatially dependent generalized linear mixed model using aggregated level data. The SAE method is also applied to produce reliable district-level estimates and maps of the incidence of indebtedness in the State of Uttar Pradesh in India, using debt and investment survey data collected by the National Sample Survey Office (NSSO) and secondary data from the Census. The results show a significant improvement in the precision of the model-based estimates generated by SAE compared with the direct estimates. The estimates that incorporate spatial information are more efficient than those that ignore it.

2.
The National Sample Survey Organisation (NSSO) surveys are the main source of official statistics in India, and generate a range of invaluable data at the macro level (e.g. state and national levels). However, the NSSO data cannot be used directly to produce reliable estimates at the micro level (e.g. district or further disaggregated levels) due to small sample sizes. There is a rapidly growing demand for such micro-level statistics in India, as the country is moving from a centralized to a more decentralized planning system. In this article, we employ small-area estimation (SAE) techniques to derive model-based estimates of the proportion of indebted households at district or other small-area levels in the state of Uttar Pradesh in India by linking data from the Debt–Investment Survey 2002–2003 of NSSO with the Population Census 2001 and the Agriculture Census 2003. Our results show that the model-based estimates are precise and representative. For many small areas, it is not even possible to produce estimates using sample data alone, yet the model-based estimates generated using SAE remain reliable for such areas. The estimates are expected to provide invaluable information to policy analysts and decision-makers.

3.
Unit level linear mixed models are often used in small area estimation (SAE), and the empirical best linear unbiased prediction (EBLUP) is widely used for the estimation of small area means under such models. However, EBLUP requires population level auxiliary data, or at least area-specific aggregated values. Sometimes population level auxiliary data are either not available or not consistent with the survey data. We describe an SAE method that uses estimated population auxiliary information. Empirical results show that the proposed method produces an efficient set of small area estimates.

4.
The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially larger proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss SAE for zero-inflated data under a two-part random effects model that accounts for excess zeros in the data. Empirical results show that the proposed method works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed small area estimator. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.

5.
M-quantile models with application to poverty mapping (total citations: 1; self-citations: 0; citations by others: 1)
Over the last decade there has been growing demand for estimates of population characteristics at small area level. Unfortunately, cost constraints in the design of sample surveys lead to small sample sizes within these areas and as a result direct estimation, using only the survey data, is inappropriate since it yields estimates with unacceptable levels of precision. Small area models are designed to tackle the small sample size problem. The most popular class of models for small area estimation is random effects models that include random area effects to account for between area variations. However, such models also depend on strong distributional assumptions, require a formal specification of the random part of the model and do not easily allow for outlier robust inference. An alternative approach to small area estimation that is based on the use of M-quantile models was recently proposed by Chambers and Tzavidis (Biometrika 93(2):255–268, 2006) and Tzavidis and Chambers (Robust prediction of small area means and distributions. Working paper, 2007). Unlike traditional random effects models, M-quantile models do not depend on strong distributional assumption and automatically provide outlier robust inference. In this paper we illustrate for the first time how M-quantile models can be practically employed for deriving small area estimates of poverty and inequality. The methodology we propose improves the traditional poverty mapping methods in the following ways: (a) it enables the estimation of the distribution function of the study variable within the small area of interest both under an M-quantile and a random effects model, (b) it provides analytical, instead of empirical, estimation of the mean squared error of the M-quantile small area mean estimates and (c) it employs a robust to outliers estimation method. 
The methodology is applied to data from the 2002 Living Standards Measurement Survey (LSMS) in Albania for estimating (a) district level estimates of the incidence of poverty in Albania, (b) district level inequality measures and (c) the distribution function of household per-capita consumption expenditure in each district. Small area estimates of poverty and inequality show that the poorest Albanian districts are in the mountainous regions (north and north east), while the wealthiest districts, which are also linked with high levels of inequality, are in the coastal (south west) and southern parts of the country. We discuss the practical advantages of our methodology and note the consistency of our results with results from previous studies. We further demonstrate the usefulness of the M-quantile estimation framework through design-based simulations based on two realistic survey data sets containing small area information and show that the M-quantile approach may be preferable when the aim is to estimate the small area distribution function.

6.
This paper develops a method of estimating micro-level poverty in cases where data are scarce. The method is applied to estimate district-level poverty using the household level Indian national sample survey data for two states, viz., West Bengal and Madhya Pradesh. The method involves estimation of state-level poverty indices from the data formed by pooling data of all the districts (each time excluding one district) and multiplying this poverty vector with a known weight matrix to obtain the unknown district-level poverty vector. The proposed method is expected to yield reliable estimates at the district level, because the district-level estimate is now based on a much larger sample size obtained by pooling data of several districts. This method can be an alternative to the “small area estimation technique” for estimating poverty at sub-state levels in developing countries.

7.
If unit‐level data are available, small area estimation (SAE) is usually based on models formulated at the unit level, but they are ultimately used to produce estimates at the area level and thus involve area‐level inferences. This paper investigates the circumstances under which using an area‐level model may be more effective. Linear mixed models (LMMs) fitted using different levels of data are applied in SAE to calculate synthetic estimators and empirical best linear unbiased predictors (EBLUPs). The performance of area‐level models is compared with unit‐level models when both individual and aggregate data are available. A key factor is whether there are substantial contextual effects. Ignoring these effects in unit‐level working models can cause biased estimates of regression parameters. The contextual effects can be automatically accounted for in the area‐level models. Using synthetic and EBLUP techniques, small area estimates based on different levels of LMMs are investigated in this paper by means of a simulation study.

8.
Binary data are often of interest in business surveys, particularly when the aim is to characterize grouping in the businesses making up the survey population. When small area estimates are required for such binary data, use of standard estimation methods based on linear mixed models (LMMs) becomes problematic. We explore two model-based techniques of small area estimation for small area proportions, the empirical best predictor (EBP) under a generalized linear mixed model and the model-based direct estimator (MBDE) under a population-level LMM. Our empirical results show that both the MBDE and the EBP perform well. The EBP is a computationally intensive method, whereas the MBDE is easy to implement. In case of model misspecification, the MBDE also appears to be more robust. The mean-squared error (MSE) estimation of MBDE is simple and straightforward, which is in contrast to the complicated MSE estimation for the EBP.

9.
Model-based estimators are becoming very popular in statistical offices because governments require accurate estimates for small domains that were not planned when the study was designed, as their inclusion would have increased the cost of the study. The sample sizes in these domains are very small or even zero; consequently, traditional direct design-based estimators lead to unacceptably large standard errors. In this regard, model-based estimators that 'borrow information' from related areas by using auxiliary information are appropriate. This paper reviews, under the model-based approach, a BLUP synthetic and an EBLUP estimator. The goal is to obtain estimators of domain totals when there are several domains with very small sample sizes or without sampled units. We also provide detailed expressions of the mean squared error at different levels of aggregation. The results are illustrated with real data from the Basque Country Business Survey.

10.
In survey sampling, policy decisions regarding the allocation of resources to sub‐groups of a population depend on reliable predictors of their underlying parameters. However, in some sub‐groups, called small areas due to small sample sizes relative to the population, the information needed for reliable estimation is typically not available. Consequently, data on a coarser scale are used to predict the characteristics of small areas. Mixed models are the primary tools in small area estimation (SAE) and also borrow information from alternative sources (e.g., previous surveys and administrative and census data sets). In many circumstances, small area predictors are associated with location. For instance, in the case of chronic disease or cancer, it is important for policy makers to understand spatial patterns of disease in order to determine small areas with high risk of disease and establish prevention strategies. The literature considering SAE with spatial random effects is sparse and mostly in the context of spatial linear mixed models. In this article, small area models are proposed for the class of spatial generalized linear mixed models to obtain small area predictors and corresponding second‐order unbiased mean squared prediction errors via Taylor expansion and a parametric bootstrap approach. The performance of the proposed approach is evaluated through simulation studies and application of the models to a real esophageal cancer data set from Minnesota, U.S.A. The Canadian Journal of Statistics 47: 426–437; 2019 © 2019 Statistical Society of Canada

11.
Small area estimation techniques are becoming increasingly used in survey applications to provide estimates for local areas of interest. The objective of this article is to develop and apply Information Theoretic (IT)-based formulations to estimate small area business and trade statistics. More specifically, we propose a Generalized Maximum Entropy (GME) approach to the problem of small area estimation that exploits auxiliary information relating to other known variables on the population and adjusts for consistency and additivity. The GME formulations, combining information from the sample together with out-of-sample aggregates of the population of interest, can be particularly useful in the context of small area estimation, for both direct and model-based estimators, since they do not require strong distributional assumptions on the disturbances. The performance of the proposed IT formulations is illustrated through real and simulated datasets.

12.
Two classes of methods properly account for clustering of data: design-based methods and model-based methods. Estimates from both methods have been shown to be approximately equal with large samples. However, both classes are known to produce biased standard error estimates with small samples. This paper compares the bias of standard errors and statistical power of marginal effects for generalized estimating equations (a design-based method) and generalized/linear mixed effects models (model-based methods) with small sample sizes via a simulation study. Provided that the distributional assumptions are met, model-based methods produced the least-biased standard error estimates and greater relative statistical power.

13.
Small area estimation (SAE) is concerned with how to reliably estimate population quantities of interest when some areas or domains have very limited samples. This is an important issue in large population surveys, because the geographical areas or groups with only small samples or even no samples are often of interest to researchers and policy-makers. For example, large population health surveys, such as the Behavioural Risk Factor Surveillance System and the Ohio Medicaid Assessment Survey (OMAS), are regularly conducted for monitoring insurance coverage and healthcare utilization. Classic approaches usually provide accurate estimators at the state level or large geographical region level, but they fail to provide reliable estimators for many rural counties where the samples are sparse. Moreover, a systematic evaluation of the performance of SAE methods in real-world settings is lacking in the literature. In this paper, we propose a Bayesian hierarchical model with constraints on the parameter space and show that it provides superior estimators for county-level adult uninsured rates in Ohio based on the 2012 OMAS data. Furthermore, we perform extensive simulation studies to compare our methods with a collection of common SAE strategies, including direct estimators, synthetic estimators, composite estimators, and the Bayesian hierarchical model-based estimators of Datta GS, Ghosh M, Steorts R, Maples J. [Bayesian benchmarking with applications to small area estimation. Test 2011;20(3):574–588]. To set a fair basis for comparison, we generate our simulation data with characteristics mimicking the real OMAS data, so that neither model-based nor design-based strategies use the true model specification. The estimators based on our proposed model are shown to outperform other estimators for small areas in both the simulation study and the real data analysis.
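The direct, synthetic, and composite estimators named in the abstract above can be illustrated with a minimal sketch for area-level proportions. This is not the paper's implementation: the function names, the toy counts, and the weighting rule w = n / (n + k) with tuning constant k are all assumptions chosen for illustration.

```python
def direct_estimate(successes, n):
    """Direct estimator: the area's own sample proportion (None if unsampled)."""
    return successes / n if n > 0 else None


def synthetic_estimate(total_successes, total_n):
    """Synthetic estimator: the pooled proportion, applied to every area."""
    return total_successes / total_n


def composite_estimate(successes, n, synthetic, k=30.0):
    """Composite estimator: weighted mix of direct and synthetic estimates.

    The weight w = n / (n + k) is a hypothetical choice; k is an assumed
    tuning constant, not a value taken from any of the cited papers.
    """
    if n == 0:
        return synthetic  # nothing to mix in for an unsampled area
    w = n / (n + k)
    return w * (successes / n) + (1.0 - w) * synthetic


# Toy data (invented): area -> (number uninsured in sample, sample size)
areas = {"A": (12, 100), "B": (1, 5), "C": (0, 0)}
pooled = synthetic_estimate(
    sum(s for s, _ in areas.values()), sum(n for _, n in areas.values())
)
estimates = {
    name: composite_estimate(s, n, pooled) for name, (s, n) in areas.items()
}
```

With this rule, a well-sampled area stays close to its direct estimate, a sparsely sampled area is pulled toward the pooled rate, and an unsampled area falls back to the synthetic estimate entirely.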

14.
Small area statistics obtained from sample survey data provide a critical source of information used to study health, economic, and sociological trends. However, most large-scale sample surveys are not designed for the purpose of producing small area statistics. Moreover, data disseminators are prevented from releasing public-use microdata for small geographic areas for disclosure reasons, thus limiting the utility of the data they collect. This research evaluates a synthetic data method, intended for data disseminators, for releasing public-use microdata for small geographic areas based on complex sample survey data. The method replaces all observed survey values with synthetic (or imputed) values generated from a hierarchical Bayesian model that explicitly accounts for complex sample design features, including stratification, clustering, and sampling weights. The method is applied to restricted microdata from the National Health Interview Survey and synthetic data are generated for both sampled and non-sampled small areas. The analytic validity of the resulting small area inferences is assessed by direct comparison with the actual data, a simulation study, and a cross-validation study.

15.
Spatial robust small area estimation (total citations: 1; self-citations: 0; citations by others: 1)
The accuracy of recent applications in small area statistics in many cases depends heavily on the assumed properties of the underlying models and the availability of micro information. In finite population sampling, small sample sizes may increase the sensitivity of the modeling with respect to single units. In these cases, area-specific sample sizes tend to be so small that normality assumptions, even for area means, seem to be violated. Hence, applying robust estimation methods is expected to yield more reliable results. In general, two robust small area methods are applied, the robust EBLUP and the M-quantile method. Additionally, the use of adequate auxiliary information may further increase the accuracy of the estimates. In prediction based approaches where information is needed at the universe level, in general only a few variables are available for modeling. In addition to variables from the dataset, in many cases further information may be available, e.g. geographical information which could indicate spatial dependencies between neighboring areas. This spatial information can be included in the modeling using spatially correlated area effects. In this paper the classical robust EBLUP is extended to cover spatial area effects via a simultaneous autoregressive model. The performance of the different estimators is compared in a model-based simulation study.

16.
Sample surveys are usually designed and analysed to produce estimates for larger areas. Nevertheless, sample sizes are often not large enough to give adequate precision for small area estimates of interest. To overcome such difficulties, borrowing strength from related small areas via modelling becomes essential. In line with this, we propose components of variance models with power transformations for small area estimation. This paper reports the results of a study aimed at incorporating the power transformation in small area estimation for improving the quality of small area predictions. The proposed methods are demonstrated on satellite data in conjunction with survey data to estimate mean acreage under a specified crop for counties in Iowa.

17.
The paper deals with producing estimates for geographical domains for a variable with a spatial pattern when information about the location of the population units is incomplete. The spatial distribution of the study variable and its possible relations with other covariates are modeled by a geoadditive regression. Using such a model to produce model-based estimates for geographical domains requires all the population units to be referenced at point locations; typically, however, the spatial coordinates are known only for the sampled units. An approach to treat the lack of geographical information for non-sampled units is suggested: it is proposed to impose a distribution on the spatial locations inside each domain. This is realized through a hierarchical Bayesian formulation of the geoadditive model in which a prior distribution on the spatial coordinates is defined. The performance of the proposed imputation approach is evaluated through various Markov Chain Monte Carlo experiments implemented under different scenarios.

18.
Multivariate Fay-Herriot (MFH) models have become popular methods for producing reliable parameter estimates of several related characteristics of interest that are commonly produced from many surveys. This article studies the application of MFH models for estimating household consumption per capita expenditure (HCPE) on food and HCPE on non-food. The two associated direct estimates, which are obtained from the National Socioeconomic Surveys conducted regularly by Statistics Indonesia, are strongly correlated. The effects of correlation in MFH models are evaluated by means of a simulation study. The simulation showed that the strength of correlation between the variables of interest, rather than the number of domains, plays a prominent role in MFH models. The application showed that MFH models are more efficient than univariate models in terms of the standard errors of the regression parameter estimates. The root mean squared errors (RMSEs) of the estimates obtained from the empirical best linear unbiased prediction (EBLUP) estimators of MFH models are smaller than the RMSEs obtained from the direct estimators. Based on the MFH model, the HCPE estimates of food by district in Central Java, Indonesia, are higher than the HCPE estimates of non-food. The average HCPE estimates of food and non-food in Central Java, Indonesia, in 2015 are IDR 383,100.6 and IDR 280,653.6, respectively.
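For orientation, the standard univariate Fay-Herriot model that the multivariate version generalizes can be written as below. The notation is an assumption made for this note and is not taken from the article: $\hat\theta_i$ is the direct estimate for area $i$, $x_i$ the auxiliary covariates, $\psi_i$ the sampling variance (treated as known), and $\sigma_u^2$ the area-effect variance.

```latex
\begin{align}
  \hat\theta_i &= \theta_i + e_i, & e_i &\sim N(0, \psi_i), \\
  \theta_i &= x_i^{\top}\beta + u_i, & u_i &\sim N(0, \sigma_u^2), \\
  \tilde\theta_i^{\mathrm{EBLUP}} &= \hat\gamma_i\,\hat\theta_i
      + (1 - \hat\gamma_i)\, x_i^{\top}\hat\beta, &
  \hat\gamma_i &= \frac{\hat\sigma_u^2}{\hat\sigma_u^2 + \psi_i}.
\end{align}
```

The shrinkage weight $\hat\gamma_i$ moves the prediction toward the regression-synthetic part $x_i^{\top}\hat\beta$ when the sampling variance $\psi_i$ is large relative to $\hat\sigma_u^2$. In the multivariate case the scalar variances are replaced by covariance matrices, which is how correlation between the target variables enters the model.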

19.
ESTIMATING A LOGIT MODEL WITH RANDOMIZED DATA: THE CASE OF COCAINE USE (total citations: 1; self-citations: 0; citations by others: 1)
In his influential book, Maddala (1983) suggests combining randomized response survey data with other personal information to estimate logit models predicting immoral, unpopular, or unlawful behaviour. This study is one of the first to implement this technique using real data. Models of college students' recent cocaine use are estimated with academic performance and socio-economic characteristics as determinants. Parameter estimates obtained from randomized response surveys are compared to those obtained using conventional, direct question surveys. The results indicate that randomized response estimates provide useful information on the degree to which inferences regarding the determinants of cocaine use are sensitive to survey type.
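To make the idea of a logit model fitted to randomized response data concrete, the sketch below writes down the likelihood under one specific design. The abstract does not say which randomizing device the study used, so Warner's design is assumed here: with known probability p the respondent answers the sensitive statement, and otherwise answers its negation. All function names are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def prob_yes(x, beta, p):
    """P(recorded 'yes') under the assumed Warner design.

    pi is the logit probability of the sensitive behaviour; the recorded
    answer is truthful about the statement with probability p and about
    its negation with probability 1 - p.
    """
    pi = logistic(sum(b * xi for b, xi in zip(beta, x)))
    return p * pi + (1.0 - p) * (1.0 - pi)


def log_likelihood(beta, X, answers, p):
    """Log-likelihood of the recorded yes/no answers, to be maximized in beta."""
    ll = 0.0
    for x, answered_yes in zip(X, answers):
        q = prob_yes(x, beta, p)
        ll += math.log(q) if answered_yes else math.log(1.0 - q)
    return ll


def prevalence_from_yes_rate(yes_rate, p):
    """Moment estimator of overall prevalence; undefined when p == 0.5."""
    return (yes_rate - (1.0 - p)) / (2.0 * p - 1.0)
```

Maximizing `log_likelihood` over `beta` with any numerical optimizer gives the randomized-response logit estimates under this assumed design. Because the randomization adds noise, they will generally be less precise than estimates from truthful direct answers on a sample of the same size.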

20.
Small area estimation plays a prominent role in survey sampling due to a growing demand for reliable small area estimates from both public and private sectors. The popularity of model-based inference is increasing in survey sampling, particularly in small area estimation. The estimates of the small area parameters can profitably ‘borrow strength’ from data on related multiple characteristics and/or auxiliary variables from other neighboring areas through appropriate models. Fay (1987, Small Area Statistics, Wiley, New York, pp. 91–102) proposed multivariate regression for small area estimation of multiple characteristics. The success of this modeling rests essentially on the strength of correlation of these dependent variables. To estimate small area mean vectors of multiple characteristics, multivariate modeling has been proposed in the literature via a multivariate variance components model. We use this approach for empirical best linear unbiased and empirical Bayes prediction of small area mean vectors. We use data from Battese et al. (1988, J. Amer. Statist. Assoc. 83, 28–36) to conduct a simulation which shows that the multivariate approach may achieve substantial improvement over the usual univariate approach.
