
1.

What Level of Statistical Model Should We Use in Small Area Estimation?

Mohammad‐Reza Namazi‐Rad David Steel 《Australian & New Zealand Journal of Statistics》2015,57(2):275-298

If unit‐level data are available, small area estimation (SAE) is usually based on models formulated at the unit level, but they are ultimately used to produce estimates at the area level and thus involve area‐level inferences. This paper investigates the circumstances under which using an area‐level model may be more effective. Linear mixed models (LMMs) fitted using different levels of data are applied in SAE to calculate synthetic estimators and empirical best linear unbiased predictors (EBLUPs). The performance of area‐level models is compared with unit‐level models when both individual and aggregate data are available. A key factor is whether there are substantial contextual effects. Ignoring these effects in unit‐level working models can cause biased estimates of regression parameters. The contextual effects can be automatically accounted for in the area‐level models. Using synthetic and EBLUP techniques, small area estimates based on different levels of LMMs are investigated in this paper by means of a simulation study.

2.

Robust small area estimation

Sanjoy K. Sinha J. N. K. Rao 《Revue canadienne de statistique》2009,37(3):381-399

Small area estimation has received considerable attention in recent years because of growing demand for small area statistics. Basic area‐level and unit‐level models have been studied in the literature to obtain empirical best linear unbiased prediction (EBLUP) estimators of small area means. Although this classical method is useful for estimating the small area means efficiently under normality assumptions, it can be highly influenced by the presence of outliers in the data. In this article, the authors investigate the robustness properties of the classical estimators and propose a resistant method for small area estimation, which is useful for downweighting any influential observations in the data when estimating the model parameters. To estimate the mean squared errors of the robust estimators of small area means, a parametric bootstrap method is adopted here, which is applicable to models with block diagonal covariance structures. Simulations are carried out to study the behaviour of the proposed robust estimators in the presence of outliers, and these estimators are also compared to the EBLUP estimators. Performance of the bootstrap mean squared error estimator is also investigated in the simulation study. The proposed robust method is also applied to some real data to estimate crop areas for counties in Iowa, using farm‐interview data on crop areas and LANDSAT satellite data as auxiliary information. The Canadian Journal of Statistics 37: 381–399; 2009 © 2009 Statistical Society of Canada

3.

Assessing different uncertainty measures of EBLUP: a resampling-based approach

《Journal of Statistical Computation and Simulation》2012,82(7):713-727

The empirical best linear unbiased prediction approach is a popular method for the estimation of small area parameters. However, the estimation of reliable mean squared prediction error (MSPE) of the estimated best linear unbiased predictors (EBLUP) is a complicated process. In this paper we study the use of resampling methods for MSPE estimation of the EBLUP. A cross-sectional and time-series stationary small area model is used to provide estimates in small areas. Under this model, a parametric bootstrap procedure and a weighted jackknife method are introduced. A Monte Carlo simulation study is conducted in order to compare the performance of different resampling-based measures of uncertainty of the EBLUP with the analytical approximation. Our empirical results show that the proposed resampling-based approaches performed better than the analytical approximation in several situations, although in some cases they tend to underestimate the true MSPE of the EBLUP in a higher number of small areas.

4.

Benchmarked linear shrinkage prediction in the Fay–Herriot small area model

Kentaro Chikamatsu Tatsuya Kubokawa 《Scandinavian Journal of Statistics》2023,50(2):572-588

The empirical best linear unbiased predictor (EBLUP) is a linear shrinkage of the direct estimate toward the regression estimate and is useful in small area estimation because it increases the precision of the estimated small area means. However, one potential difficulty with the EBLUP is that the overall estimate for a larger geographical area, formed by summing EBLUPs, is not necessarily identical to the corresponding direct estimate, such as the overall sample mean. To fix this problem, the paper suggests a new method for benchmarking the EBLUP in the Fay–Herriot model without assuming normality of random effects and sampling errors. The resulting benchmarked empirical linear shrinkage (BELS) predictor is novel in that the coefficients for benchmarking are adjusted based on the data from each area. To measure the uncertainty of BELS, a second-order unbiased estimator of the mean squared error is derived.
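
The shrinkage form described in this abstract, and the benchmarking gap it mentions, can be illustrated with a minimal Python sketch. It assumes an area-level model in which the model variance `sigma2_v` and the sampling variances are known; in an actual EBLUP the model variance would be replaced by an estimate. All function names, variable names, and numbers below are hypothetical, and the sketch does not implement the paper's BELS predictor.

```python
def shrinkage_predictor(direct, regression, sigma2_v, sampling_var):
    """Shrink each direct estimate toward its regression estimate.

    gamma_i = sigma2_v / (sigma2_v + D_i) is the weight on the direct estimate,
    where D_i is the (assumed known) sampling variance of area i.
    """
    predictions = []
    for y_i, reg_i, d_i in zip(direct, regression, sampling_var):
        gamma_i = sigma2_v / (sigma2_v + d_i)
        predictions.append(gamma_i * y_i + (1.0 - gamma_i) * reg_i)
    return predictions


def weighted_total(values, weights):
    """Aggregate area-level values to the larger region with fixed weights."""
    return sum(w * v for w, v in zip(weights, values))


# Illustrative inputs only; none of these numbers come from the paper.
direct = [10.0, 20.0]        # direct survey estimates
regression = [12.0, 18.0]    # regression (synthetic) estimates
sampling_var = [4.0, 12.0]   # known sampling variances D_i
weights = [0.5, 0.5]         # aggregation weights for the larger area

predicted = shrinkage_predictor(direct, regression, sigma2_v=4.0,
                                sampling_var=sampling_var)

# Benchmarking gap: aggregated direct estimate minus aggregated predictions.
gap = weighted_total(direct, weights) - weighted_total(predicted, weights)
```

With these inputs the aggregated predictions differ from the aggregated direct estimate, which is the discrepancy that a benchmarking adjustment is meant to remove.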

5.

Corrected empirical Bayes confidence intervals in nested error regression models

Tatsuya Kubokawa 《Journal of the Korean Statistical Society》2010,39(2):221-236

In small area estimation, the empirical best linear unbiased predictor (EBLUP) or the empirical Bayes estimator (EB) in the linear mixed model is recognized to be useful because it gives a stable and reliable estimate for a mean of a small area. In practical situations where EBLUP is applied to real data, it is important to evaluate how reliable EBLUP is. One method for this purpose is to construct a confidence interval based on EBLUP. In this paper, we obtain an asymptotically corrected empirical Bayes confidence interval in a nested error regression model with unbalanced sample sizes and unknown components of variance. The coverage probability is shown to satisfy the confidence level in the second-order asymptotics. It is numerically revealed that the corrected confidence interval is superior to the conventional confidence interval based on the sample mean in terms of the coverage probability and the expected width of the interval. Finally, it is applied to the posted land price data in Tokyo and the neighboring prefecture.

6.

Spatial robust small area estimation (total citations: 1; self-citations: 0; citations by others: 1)

Timo Schmid Ralf T. Münnich 《Statistical Papers》2014,55(3):653-670

The accuracy of recent applications in small area statistics in many cases highly depends on the assumed properties of the underlying models and the availability of micro information. In finite population sampling, small sample sizes may increase the sensitivity of the modeling with respect to single units. In these cases, area-specific sample sizes tend to be small such that normal assumptions, even of area means, seem to be violated. Hence, applying robust estimation methods is expected to yield more reliable results. In general, two robust small area methods are applied, the robust EBLUP and the M-quantile method. Additionally, the use of adequate auxiliary information may further increase the accuracy of the estimates. In prediction based approaches where information is needed on universe level, in general, only few variables are available which can be used for modeling. In addition to variables from the dataset, in many cases further information may be available, e.g. geographical information which could indicate spatial dependencies between neighboring areas. This spatial information can be included in the modeling using spatially correlated area effects. Within the paper the classical robust EBLUP is extended to cover spatial area effects via a simultaneous autoregressive model. The performance of the different estimators is compared in a model-based simulation study.

7.

Spatial generalized linear mixed models in small area estimation

Mahmoud Torabi 《Revue canadienne de statistique》2019,47(3):426-437

In survey sampling, policy decisions regarding the allocation of resources to sub‐groups of a population depend on reliable predictors of their underlying parameters. However, in some sub‐groups, called small areas due to small sample sizes relative to the population, the information needed for reliable estimation is typically not available. Consequently, data on a coarser scale are used to predict the characteristics of small areas. Mixed models are the primary tools in small area estimation (SAE) and also borrow information from alternative sources (e.g., previous surveys and administrative and census data sets). In many circumstances, small area predictors are associated with location. For instance, in the case of chronic disease or cancer, it is important for policy makers to understand spatial patterns of disease in order to determine small areas with high risk of disease and establish prevention strategies. The literature considering SAE with spatial random effects is sparse and mostly in the context of spatial linear mixed models. In this article, small area models are proposed for the class of spatial generalized linear mixed models to obtain small area predictors and corresponding second‐order unbiased mean squared prediction errors via Taylor expansion and a parametric bootstrap approach. The performance of the proposed approach is evaluated through simulation studies and application of the models to a real esophageal cancer data set from Minnesota, U.S.A. The Canadian Journal of Statistics 47: 426–437; 2019 © 2019 Statistical Society of Canada

8.

Small area estimation under random regression coefficient models

Tomáš Hobza Domingo Morales 《Journal of Statistical Computation and Simulation》2013,83(11):2160-2177

Statistical agencies are interested in reporting precise estimates of linear parameters from small areas. This goal can be achieved by using model-based inference. In this sense, random regression coefficient models provide a flexible way of modelling the relationship between the target and the auxiliary variables. Because of this, empirical best linear unbiased predictor (EBLUP) estimates based on these models are introduced. A closed-formula procedure to estimate the mean-squared error of the EBLUP estimators is also given and empirically studied. Results of several simulation studies are reported as well as an application to the estimation of household normalized net annual incomes in the Spanish Living Conditions Survey.

9.

Copula-based predictions in small area estimation

Kanika Grover Elif F. Acar Mahmoud Torabi 《Revue canadienne de statistique》2020,48(4):685-711

Unit-level regression models are commonly used in small area estimation (SAE) to obtain an empirical best linear unbiased prediction of small area characteristics. The underlying assumptions of these models, however, may be unrealistic in some applications. Previous work developed a copula-based SAE model where the empirical Kendall's tau was used to estimate the dependence between two units from the same area. In this article, we propose a likelihood framework to estimate the intra-class dependence of the multivariate exchangeable copula for the empirical best unbiased prediction (EBUP) of small area means. One appeal of the proposed approach lies in its accommodation of both parametric and semi-parametric estimation approaches. Under each estimation method, we further propose a bootstrap approach to obtain a nearly unbiased estimator of the mean squared prediction error of the EBUP of small area means. The performance of the proposed methods is evaluated through simulation studies and also by a real data application.

10.

A pseudo‐empirical best linear unbiased prediction approach to small area estimation using survey weights

Yong You J. N. K. Rao 《Revue canadienne de statistique》2002,30(3):431-439

The authors develop a small area estimation method using a nested error linear regression model and survey weights. In particular, they propose a pseudo‐empirical best linear unbiased prediction (pseudo‐EBLUP) estimator to estimate small area means. This estimator borrows strength across areas through the model and makes use of the survey weights to preserve the design consistency as the area sample size increases. The proposed estimator also has a nice self‐benchmarking property. The authors also obtain an approximation to the model mean squared error (MSE) of the proposed estimator and a nearly unbiased estimator of MSE. Finally, they compare the proposed estimator with the EBLUP estimator and the pseudo‐EBLUP estimator proposed by Prasad & Rao (1999), using data analyzed earlier by Battese, Harter & Fuller (1988).

11.

Small Area Estimation for Zero-Inflated Data

Hukum Chandra U. C. Sud 《Communications in Statistics - Simulation and Computation》2013,42(5):632-643

The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially larger proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss SAE for zero-inflated data under a two-part random effects model that accounts for excess zeros in the data. Empirical results show that the proposed method for SAE works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed estimator of small areas. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.

12.

Small area estimation via heteroscedastic nested‐error regression

Jiming Jiang Thuan Nguyen 《Revue canadienne de statistique》2012,40(3):588-603

We show that the maximum likelihood estimators (MLEs) of the fixed effects and within‐cluster correlation are consistent in a heteroscedastic nested‐error regression (HNER) model with completely unknown within‐cluster variances under mild conditions. The result implies that the empirical best linear unbiased prediction (EBLUP) method for small area estimation is valid in such a case. We also show that ignoring the heteroscedasticity can lead to inconsistent estimation of the within‐cluster correlation and inferior predictive performance. A jackknife measure of uncertainty for the EBLUP is developed under the HNER model. Simulation studies are carried out to investigate the finite‐sample performance of the EBLUP and MLE under the HNER model, with comparisons to those under the nested‐error regression model in various situations, as well as that of the jackknife measure of uncertainty. The well‐known Iowa crops data is used for illustration. The Canadian Journal of Statistics 40: 588–603; 2012 © 2012 Statistical Society of Canada

13.

Exploring spatial dependence in area-level random effect model for disaggregate-level crop yield estimation

Hukum Chandra 《Journal of applied statistics》2013,40(4):823-842

This paper describes an application of small area estimation (SAE) techniques under area-level spatial random effect models when only area (or district or aggregated) level data are available. In particular, the SAE approach is applied to produce district-level model-based estimates of crop yield for paddy in the state of Uttar Pradesh in India using the data on crop-cutting experiments supervised under the Improvement of Crop Statistics scheme and the secondary data from the Population Census. The diagnostic measures are illustrated to examine the model assumptions as well as reliability and validity of the generated model-based small area estimates. The results show a considerable gain in precision in model-based estimates produced applying SAE. Furthermore, the model-based estimates obtained by exploiting spatial information are more efficient than those obtained by ignoring this information. However, both of these model-based estimates are more efficient than the direct survey estimate. In many districts, there are no survey data and therefore it is not possible to produce direct survey estimates for these districts. The model-based estimates generated using SAE are still reliable for such districts. These estimates produced by using SAE will provide invaluable information to policy-analysts and decision-makers.

14.

Small area estimation of proportions under a spatial dependent aggregated level random effects model

Hukum Chandra Nicola Salvati 《Communications in Statistics - Theory and Methods》2018,47(5):1234-1255

This paper describes small area estimation (SAE) of proportions under a spatial dependent generalized linear mixed model using aggregated level data. The SAE is also applied to produce reliable district level estimates and mapping of incidence of indebtedness in the State of Uttar Pradesh in India using debt and investment survey data collected by the National Sample Survey Office (NSSO) and the secondary data from the Census. The results show a significant improvement in precision of model-based estimates generated by SAE as compared to direct estimates. The estimates generated by incorporating spatial information are more efficient than those generated by ignoring this information.

15.

An objective stepwise Bayes approach to small area estimation

Yanping Qu Bo Zhang 《Journal of Statistical Computation and Simulation》2015,85(7):1474-1494

The term ‘small area’ or ‘small domain’ is commonly used to denote a small geographical area that has a small subpopulation of people within a large area. Small area estimation is an important area in survey sampling because of the growing demand for better statistical inference for small areas in public or private surveys. In small area estimation problems the focus is on how to borrow strength across areas in order to develop a reliable estimator that makes use of available auxiliary information. Some traditional methods for small area problems such as empirical best linear unbiased prediction borrow strength through linear models that provide links to related areas, which may not be appropriate for some survey data. In this article, we propose a stepwise Bayes approach which borrows strength through an objective posterior distribution. This approach results in a generalized constrained Dirichlet posterior estimator when auxiliary information is available for small areas. The objective posterior distribution is based only on the assumption of exchangeability across related areas and does not make any explicit model assumptions. The form of our posterior distribution allows us to assign a weight to each member of the sample. These weights can then be used in a straightforward fashion to make inferences about the small area means. Theoretically, the stepwise Bayes character of the posterior allows one to prove the admissibility of the point estimators suggesting that inferential procedures based on this approach will tend to have good frequentist properties. Numerically, we demonstrate in simulations that the proposed stepwise Bayes approach can have substantial strengths compared to traditional methods.

16.

Small area estimation strategies for large population surveys: a comparison of design and model-based methods

Zhaonan Li Xinyi Xu 《Journal of Statistical Computation and Simulation》2017,87(4):817-833

Small area estimation (SAE) is concerned with how to reliably estimate population quantities of interest when some areas or domains have very limited samples. This is an important issue in large population surveys, because the geographical areas or groups with only small samples or even no samples are often of interest to researchers and policy-makers. For example, large population health surveys, such as Behavioural Risk Factor Surveillance System and Ohio Medicaid Assessment Survey (OMAS), are regularly conducted for monitoring insurance coverage and healthcare utilization. Classic approaches usually provide accurate estimators at the state level or large geographical region level, but they fail to provide reliable estimators for many rural counties where the samples are sparse. Moreover, a systematic evaluation of the performances of the SAE methods in real-world settings is lacking in the literature. In this paper, we propose a Bayesian hierarchical model with constraints on the parameter space and show that it provides superior estimators for county-level adult uninsured rates in Ohio based on the 2012 OMAS data. Furthermore, we perform extensive simulation studies to compare our methods with a collection of common SAE strategies, including direct estimators, synthetic estimators, composite estimators, and Datta GS, Ghosh M, Steorts R, Maples J.'s [Bayesian benchmarking with applications to small area estimation. Test 2011;20(3):574–588] Bayesian hierarchical model-based estimators. To set a fair basis for comparison, we generate our simulation data with characteristics mimicking the real OMAS data, so that neither model-based nor design-based strategies use the true model specification. The estimators based on our proposed model are shown to outperform other estimators for small areas in both simulation study and real data analysis.
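
The direct, synthetic, and composite estimators named in this abstract as comparison strategies can be sketched for an area-level proportion as follows. This is a minimal Python illustration, not the paper's Bayesian hierarchical model: the function name, the pooled-rate synthetic estimator, and the sample-size-dependent weight `n / (n + k)` with `k = 10` are all assumptions chosen for simplicity.

```python
def composite_proportions(counts, k=10.0):
    """Composite estimate of a proportion for each area.

    counts: list of (successes, sample_size) pairs, one per area.
    The weight n_i / (n_i + k) on the direct estimate is an illustrative,
    sample-size-dependent choice (hypothetical), not a tuned value.
    """
    total_success = sum(s for s, _ in counts)
    total_n = sum(n for _, n in counts)
    synthetic = total_success / total_n   # pooled rate applied to every area
    estimates = []
    for successes, n in counts:
        if n == 0:
            # No sample in this area: the direct estimate is undefined,
            # so fall back entirely on the synthetic estimate.
            estimates.append(synthetic)
            continue
        direct = successes / n
        phi = n / (n + k)
        estimates.append(phi * direct + (1.0 - phi) * synthetic)
    return estimates


# Illustrative counts: a well-sampled area, an unsampled area, a sparse area.
estimates = composite_proportions([(8, 10), (0, 0), (1, 2)])
```

The unsampled area receives the pooled rate, and the sparsely sampled area is pulled most of the way toward it, which is the behaviour that motivates model-based alternatives when county samples are thin.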

17.

Small area estimation of proportions in business surveys

《Journal of Statistical Computation and Simulation》2012,82(6):783-795

Binary data are often of interest in business surveys, particularly when the aim is to characterize grouping in the businesses making up the survey population. When small area estimates are required for such binary data, use of standard estimation methods based on linear mixed models (LMMs) becomes problematic. We explore two model-based techniques of small area estimation for small area proportions, the empirical best predictor (EBP) under a generalized linear mixed model and the model-based direct estimator (MBDE) under a population-level LMM. Our empirical results show that both the MBDE and the EBP perform well. The EBP is a computationally intensive method, whereas the MBDE is easy to implement. In case of model misspecification, the MBDE also appears to be more robust. The mean-squared error (MSE) estimation of MBDE is simple and straightforward, which is in contrast to the complicated MSE estimation for the EBP.

18.

Semi‐parametric small‐area estimation by combining time‐series and cross‐sectional data methods

Farhad Shokoohi Mahmoud Torabi 《Australian & New Zealand Journal of Statistics》2018,60(3):323-342

In survey sampling, policymaking regarding the allocation of resources to subgroups (called small areas) or the determination of subgroups with specific properties in a population should be based on reliable estimates. Information, however, is often collected at a different scale than that of these subgroups; hence, the estimation can only be obtained on finer scale data. Parametric mixed models are commonly used in small‐area estimation. The relationship between predictors and response, however, may not be linear in some real situations. Recently, small‐area estimation using a generalised linear mixed model (GLMM) with a penalised spline (P‐spline) regression model, for the fixed part of the model, has been proposed to analyse cross‐sectional responses, both normal and non‐normal. However, there are many situations in which the responses in small areas are serially dependent over time. Such a situation is exemplified by a data set on the annual number of visits to physicians by patients seeking treatment for asthma, in different areas of Manitoba, Canada. In cases where covariates that can possibly predict physician visits by asthma patients (e.g. age and genetic and environmental factors) may not have a linear relationship with the response, new models for analysing such data sets are required. In the current work, using both time‐series and cross‐sectional data methods, we propose P‐spline regression models for small‐area estimation under GLMMs. Our proposed model covers both normal and non‐normal responses. In particular, the empirical best predictors of small‐area parameters and their corresponding prediction intervals are studied with the maximum likelihood estimation approach being used to estimate the model parameters. The performance of the proposed approach is evaluated using some simulations and also by analysing two real data sets (precipitation and asthma).

19.

Functional Mixed Effects Model for Small Area Estimation

Tapabrata Maiti Samiran Sinha Ping‐Shou Zhong 《Scandinavian Journal of Statistics》2016,43(3):886-903

Functional data analysis has become an important area of research because of its ability to handle high‐dimensional and complex data structures. However, the development is limited in the context of linear mixed effect models and, in particular, for small area estimation. The linear mixed effect models are the backbone of small area estimation. In this article, we consider area‐level data and fit a varying coefficient linear mixed effect model where the varying coefficients are semiparametrically modelled via B‐splines. We propose a method of estimating the fixed effect parameters and consider prediction of random effects that can be implemented using standard software. For measuring prediction uncertainties, we derive an analytical expression for the mean squared errors and propose a method of estimating the mean squared errors. The procedure is illustrated via a real data example, and operating characteristics of the method are judged using finite sample simulation studies.

20.

Bootstrap mean squared error of a small-area EBLUP

《Journal of Statistical Computation and Simulation》2012,82(5):443-462

Concerning the estimation of linear parameters in small areas, a nested-error regression model is assumed for the values of the target variable in the units of a finite population. Then, a bootstrap procedure is proposed for estimating the mean squared error (MSE) of the EBLUP under the finite population setup. The consistency of the bootstrap procedure is studied, and a simulation experiment is carried out in order to compare the performance of two different bootstrap estimators with the approximation given by Prasad and Rao [Prasad, N.G.N. and Rao, J.N.K., 1990, The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association, 85, 163–171.]. In the numerical results, one of the bootstrap estimators shows a better bias behavior than the Prasad–Rao approximation for some of the small areas and not much worse in any case. Further, it shows less MSE in situations of moderate heteroscedasticity and under misspecification of the error distribution as normal when the true distribution is logistic or Gumbel. The proposed bootstrap method can be applied to more general types of parameters (linear or not) and predictors.
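
The general idea of a parametric bootstrap for the MSE of an EBLUP can be sketched in Python as below. To keep the example short it swaps in a much simpler setting than the abstract's nested-error, finite-population model: an intercept-only area-level model y_i = mu + v_i + e_i with known sampling variances, a moment estimator of the model variance, and normal errors. Every name and number is hypothetical, and no bias correction is applied, so this is a rough illustration of the resampling loop rather than the authors' estimator.

```python
import random
import statistics


def fit_and_predict(y, d):
    """Fit y_i = mu + v_i + e_i with known sampling variances d_i.

    Returns (mu_hat, a_hat, eblup), where a_hat is a moment estimate of
    Var(v_i), truncated at zero.
    """
    a_hat = max(0.0, statistics.variance(y) - statistics.fmean(d))
    w = [1.0 / (a_hat + d_i) for d_i in d]
    mu_hat = sum(w_i * y_i for w_i, y_i in zip(w, y)) / sum(w)
    eblup = [
        (a_hat / (a_hat + d_i)) * y_i + (d_i / (a_hat + d_i)) * mu_hat
        for y_i, d_i in zip(y, d)
    ]
    return mu_hat, a_hat, eblup


def bootstrap_mse(y, d, n_boot=500, seed=0):
    """Parametric bootstrap estimate of the MSE of each area's EBLUP."""
    rng = random.Random(seed)
    mu_hat, a_hat, _ = fit_and_predict(y, d)
    sq_err = [0.0] * len(y)
    for _ in range(n_boot):
        # Generate bootstrap "true" area means and data from the fitted model.
        theta_star = [mu_hat + rng.gauss(0.0, a_hat ** 0.5) for _ in y]
        y_star = [t + rng.gauss(0.0, d_i ** 0.5)
                  for t, d_i in zip(theta_star, d)]
        # Refit on the bootstrap data and compare predictions with the
        # bootstrap truth.
        _, _, eblup_star = fit_and_predict(y_star, d)
        for i, (p, t) in enumerate(zip(eblup_star, theta_star)):
            sq_err[i] += (p - t) ** 2
    return [s / n_boot for s in sq_err]


# Illustrative data only.
y_obs = [10.0, 14.0, 9.0, 17.0, 12.0, 8.0]
d_known = [2.0, 3.0, 1.0, 4.0, 2.0, 3.0]
mse_hat = bootstrap_mse(y_obs, d_known, n_boot=200, seed=1)
```

Because the bootstrap truth `theta_star` is known in each replicate, the average squared prediction error captures both the prediction variance and the extra variability from re-estimating the model parameters, which is what an analytical approximation of the Prasad–Rao type targets with explicit formulas.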