Similar Literature
20 similar documents found.
1.
Hierarchical models are popular in many applied statistics fields including Small Area Estimation. One well known model employed in this particular field is the Fay–Herriot model, in which unobservable parameters are assumed to be Gaussian. In hierarchical models, assumptions about unobservable quantities are difficult to check. For a special case of the Fay–Herriot model, Sinharay and Stern [2003. Posterior predictive model checking in Hierarchical models. J. Statist. Plann. Inference 111, 209–221] showed that violations of the assumptions about the random effects are difficult to detect using posterior predictive checks. In the present paper we consider two extensions of the Fay–Herriot model in which the random effects are assumed to be distributed according to either an exponential power (EP) distribution or a skewed EP distribution. We aim to explore the robustness of the Fay–Herriot model for the estimation of individual area means as well as the empirical distribution function of their ‘ensemble’. Our findings, which are based on a simulation experiment, are largely consistent with those of Sinharay and Stern as far as the efficient estimation of individual small area parameters is concerned. However, when estimating the empirical distribution function of the ‘ensemble’ of small area parameters, results are more sensitive to the failure of distributional assumptions.
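For reference, the standard Fay–Herriot model discussed above can be written as a two-level model (the notation here is generic rather than this paper's own):

    y_i \mid \theta_i \sim N(\theta_i, \psi_i)                      (sampling model; \psi_i known)
    \theta_i = x_i^\top \beta + u_i,   u_i \sim N(0, \sigma_u^2)    (linking model)

The extensions considered in the paper keep this structure but replace the Gaussian law of the random effects u_i with an exponential power or skewed exponential power distribution.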

2.
Scientific experiments commonly result in clustered discrete and continuous data. Existing methods for analyzing such data include the use of quasi-likelihood procedures and generalized estimating equations to estimate marginal mean response parameters. In applications to areas such as developmental toxicity studies, where discrete and continuous measurements are recorded on each fetus, or clinical ophthalmologic trials, where different types of observations are made on each eye, the assumption that data within a cluster are exchangeable is often very reasonable. We use this assumption to formulate fully parametric regression models for clusters of bivariate data with binary and continuous components. The regression models proposed have marginal interpretations and reproducible model structures. Tractable expressions for likelihood equations are derived and iterative schemes are given for computing efficient estimates (MLEs) of the marginal mean, correlations, variances and higher moments. We demonstrate the use of the ‘exchangeable’ procedure with an application to a developmental toxicity study involving fetal weight and malformation data.

3.
M-quantile models with application to poverty mapping
Over the last decade there has been growing demand for estimates of population characteristics at small area level. Unfortunately, cost constraints in the design of sample surveys lead to small sample sizes within these areas and as a result direct estimation, using only the survey data, is inappropriate since it yields estimates with unacceptable levels of precision. Small area models are designed to tackle the small sample size problem. The most popular class of models for small area estimation is random effects models that include random area effects to account for between area variations. However, such models also depend on strong distributional assumptions, require a formal specification of the random part of the model and do not easily allow for outlier robust inference. An alternative approach to small area estimation that is based on the use of M-quantile models was recently proposed by Chambers and Tzavidis (Biometrika 93(2):255–268, 2006) and Tzavidis and Chambers (Robust prediction of small area means and distributions. Working paper, 2007). Unlike traditional random effects models, M-quantile models do not depend on strong distributional assumptions and automatically provide outlier-robust inference. In this paper we illustrate for the first time how M-quantile models can be practically employed for deriving small area estimates of poverty and inequality. The methodology we propose improves the traditional poverty mapping methods in the following ways: (a) it enables the estimation of the distribution function of the study variable within the small area of interest both under an M-quantile and a random effects model, (b) it provides analytical, instead of empirical, estimation of the mean squared error of the M-quantile small area mean estimates and (c) it employs an outlier-robust estimation method. The methodology is applied to data from the 2002 Living Standards Measurement Survey (LSMS) in Albania for estimating (a) district level estimates of the incidence of poverty in Albania, (b) district level inequality measures and (c) the distribution function of household per-capita consumption expenditure in each district. Small area estimates of poverty and inequality show that the poorest Albanian districts are in the mountainous regions (north and north east) with the wealthiest districts, which are also linked with high levels of inequality, in the coastal (south west) and southern part of the country. We discuss the practical advantages of our methodology and note the consistency of our results with results from previous studies. We further demonstrate the usefulness of the M-quantile estimation framework through design-based simulations based on two realistic survey data sets containing small area information and show that the M-quantile approach may be preferable when the aim is to estimate the small area distribution function.
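For orientation, the M-quantile regression model at the core of the paper can be stated compactly (following Breckling and Chambers' definition; the paper's exact notation may differ). The M-quantile of order q of y given x, Q_q(x), solves

    E[\psi_q(y - Q_q(x)) \mid x] = 0,   \psi_q(t) = 2\,\psi(t/\sigma_q)\,\{ q\,I(t > 0) + (1 - q)\,I(t \le 0) \},

with a linear specification Q_q(x) = x^\top \beta_q fitted by iteratively reweighted least squares, typically with a Huber-type influence function \psi. Choosing a bounded \psi is what delivers the outlier robustness referred to in the abstract.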

4.
Small‐area estimation of poverty‐related variables is an increasingly important analytical tool in targeting the delivery of food and other aid in developing countries. We compare two methods for the estimation of small‐area means and proportions, namely empirical Bayes and composite estimation, with what has become the international standard method of Elbers, Lanjouw & Lanjouw (2003). In addition to differences among the sets of estimates and associated estimated standard errors, we discuss data requirements, design and model selection issues and computational complexity. The Elbers, Lanjouw and Lanjouw (ELL) method is found to produce broadly similar estimates but to have much smaller estimated standard errors than the other methods. The question of whether these standard error estimates are downwardly biased is discussed. Although the question cannot yet be answered in full, as a precautionary measure it is strongly recommended that the ELL model be modified to include a small‐area‐level error component in addition to the cluster‐level and household‐level errors it currently contains. This recommendation is particularly important because the allocation of billions of dollars of aid funding is being determined and monitored via ELL. Under current aid distribution mechanisms, any downward bias in estimates of standard error may lead to allocations that are suboptimal because distinctions are made between estimated poverty levels at the small‐area level that are not significantly different statistically.
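Schematically (our notation, not the paper's), the ELL method rests on a nested-error regression for log household consumption, and the recommended modification adds an explicit small-area effect:

    current ELL:   \ln y_{ch} = x_{ch}^\top \beta + \eta_c + \varepsilon_{ch}                  (cluster- and household-level errors)
    recommended:   \ln y_{ach} = x_{ach}^\top \beta + v_a + \eta_{ac} + \varepsilon_{ach}      (adds a small-area-level error v_a)

Omitting v_a when it is truly present is precisely the mechanism by which the ELL standard errors could be biased downwards.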

5.
Calibration techniques in survey sampling, such as generalized regression estimation (GREG), were formalized in the 1990s to produce efficient estimators of linear combinations of study variables, such as totals or means. They implicitly rely on the assumption of a linear regression model between the variable of interest and some auxiliary variables, yielding estimates with lower variance if the model is true while remaining approximately design-unbiased even if the model does not hold. We propose a new class of model-assisted estimators obtained by releasing a few calibration constraints and replacing them with a penalty term. This penalization is added to the distance criterion to be minimized. By introducing the concept of penalized calibration, combining usual calibration and this ‘relaxed’ calibration, we are able to adjust the weight given to the available auxiliary information. We obtain a more flexible estimation procedure giving better estimates, particularly when the auxiliary information is overly abundant or only partly appropriate. Such an approach can also be seen as a design-based alternative to the estimation procedures based on the more general class of mixed models, presenting new prospects in some scopes of application such as inference on small domains.
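In schematic terms (our notation), classical calibration and the proposed penalized variant differ as follows:

    classical:   \min_w \sum_{i \in s} G(w_i, d_i)   subject to   \sum_{i \in s} w_i x_i = t_x
    penalized:   \min_w \sum_{i \in s} G(w_i, d_i) + \lambda \big\| \sum_{i \in s} w_i z_i - t_z \big\|^2   subject to   \sum_{i \in s} w_i x_i = t_x

Here d_i are the design weights, G is the calibration distance, x collects the auxiliary variables whose constraints are kept exact, z those that are released, and \lambda tunes how strongly the relaxed auxiliary information is used: \lambda \to \infty recovers full calibration on z, while \lambda = 0 discards the released constraints.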

6.
Small area estimation plays a prominent role in survey sampling due to a growing demand for reliable small area estimates from both public and private sectors. Popularity of model-based inference is increasing in survey sampling, particularly in small area estimation. The estimates of the small area parameters can profitably ‘borrow strength’ from data on related multiple characteristics and/or auxiliary variables from other neighboring areas through appropriate models. Fay (1987, Small Area Statistics, Wiley, New York, pp. 91–102) proposed multivariate regression for small area estimation of multiple characteristics. The success of this modeling rests essentially on the strength of correlation of these dependent variables. To estimate small area mean vectors of multiple characteristics, multivariate modeling has been proposed in the literature via a multivariate variance components model. We use this approach to empirical best linear unbiased and empirical Bayes prediction of small area mean vectors. We use data from Battese et al. (1988, J. Amer. Statist. Assoc. 83, 28–36) to conduct a simulation which shows that the multivariate approach may achieve substantial improvement over the usual univariate approach.

7.
Sample surveys are usually designed and analyzed to produce estimates for larger areas and/or populations. Nevertheless, sample sizes are often not large enough to give adequate precision for small area estimates of interest. To circumvent such difficulties, borrowing strength from related small areas via modeling becomes essential. In line with this, we propose a hierarchical multivariate Bayes prediction method for small area estimation based on the seemingly unrelated regressions (SUR) model. The performance of the proposed method was evaluated through simulation studies.
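For reference, a seemingly unrelated regressions system stacks one equation per characteristic and links them only through cross-equation error correlation (generic notation, not necessarily the paper's):

    y_j = X_j \beta_j + \varepsilon_j,  j = 1, \dots, m,   with   Cov(\varepsilon_{ij}, \varepsilon_{ik}) = \sigma_{jk},

so that joint (here, hierarchical Bayes) estimation borrows strength across characteristics whenever the \sigma_{jk} are appreciable.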

8.
Angling from small recreational fishing boats was used as a sampling method to quantify the relative density of snapper (Pagrus auratus) in six areas within the Cape Rodney-Okakari Point Marine Reserve (New Zealand) and four areas adjacent to the reserve. Penalized quasi-likelihood was used to fit a log-linear mixed-effects model having area and date as fixed effects and boat as a random effect. Simulation and first-order bias correction formulae were employed to assess the validity of the estimates of the area effects. The bias correction is known to be unsuitable for general use because it typically over-estimates bias, and this was observed here. However, it was qualitatively useful for indicating the direction of bias and for indicating when estimators were approximately unbiased. The parameter of primary interest was the ratio of snapper density in the marine reserve versus snapper density outside the reserve, and the estimator of this parameter was first-order asymptotically unbiased. This ratio of snapper densities was estimated to be 11 (±3).

9.
In brain mapping, the regions of the brain that are ‘activated’ by a task or external stimulus are detected by thresholding an image of test statistics. Often the experiment is repeated on several different subjects or for several different stimuli on the same subject, and the researcher is interested in the common points in the brain where ‘activation’ occurs in all test statistic images. The conjunction is thus defined as those points in the brain that show ‘activation’ in all images. We are interested in which parts of the conjunction are noise, and which show true activation in all test statistic images. We would expect truly activated regions to be larger than usual, so our test statistic is based on the volume of clusters (connected components) of the conjunction. Our main result is an approximate P-value for this in the case of the conjunction of two Gaussian test statistic images. The results are applied to a functional magnetic resonance experiment in pain perception.

10.
The empirical best linear unbiased prediction approach is a popular method for the estimation of small area parameters. However, the estimation of a reliable mean squared prediction error (MSPE) of the estimated best linear unbiased predictors (EBLUP) is a complicated process. In this paper we study the use of resampling methods for MSPE estimation of the EBLUP. A cross-sectional and time-series stationary small area model is used to provide estimates in small areas. Under this model, a parametric bootstrap procedure and a weighted jackknife method are introduced. A Monte Carlo simulation study is conducted in order to compare the performance of different resampling-based measures of uncertainty of the EBLUP with the analytical approximation. Our empirical results show that the proposed resampling-based approaches performed better than the analytical approximation in several situations, although in some cases they tend to underestimate the true MSPE of the EBLUP in a larger number of small areas.
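As a concrete illustration of the parametric bootstrap idea, here is a minimal sketch under a basic Fay–Herriot model rather than the paper's cross-sectional and time-series model; the fitting routine, the crude moment iteration, and all names are our own simplifications, not the paper's algorithm.

    import numpy as np

    def fh_fit(y, X, psi, n_iter=50):
        # Fit y_i = x_i'beta + u_i + e_i, u_i ~ N(0, s2), e_i ~ N(0, psi_i),
        # iterating weighted least squares for beta with a crude moment
        # update for the area-level variance s2.
        s2 = max(np.var(y) - np.mean(psi), 0.01)
        for _ in range(n_iter):
            w = 1.0 / (s2 + psi)
            WX = X * w[:, None]
            beta = np.linalg.solve(X.T @ WX, WX.T @ y)
            r = y - X @ beta
            # moment equation: sum_i r_i^2 / (s2 + psi_i) = m - p
            s2 = max(0.0, s2 + (np.sum(w * r ** 2) - (len(y) - X.shape[1])) / np.sum(w))
        return beta, s2

    def eblup(y, X, psi, beta, s2):
        # shrink each direct estimate toward its synthetic part x_i'beta
        gamma = s2 / (s2 + psi)
        return gamma * y + (1.0 - gamma) * (X @ beta)

    def boot_mspe(y, X, psi, B=500, seed=0):
        rng = np.random.default_rng(seed)
        beta, s2 = fh_fit(y, X, psi)
        sq_err = np.zeros(len(y))
        for _ in range(B):
            theta_b = X @ beta + rng.normal(0.0, np.sqrt(s2), size=len(y))  # bootstrap "true" means
            y_b = theta_b + rng.normal(0.0, np.sqrt(psi))                   # bootstrap direct estimates
            beta_b, s2_b = fh_fit(y_b, X, psi)                              # refit on each replicate
            sq_err += (eblup(y_b, X, psi, beta_b, s2_b) - theta_b) ** 2
        return sq_err / B                                                   # bootstrap MSPE per area

Each replicate regenerates both the true means and the direct estimates from the fitted model and then refits, so the extra uncertainty from estimating the variance components is reflected in the MSPE estimate.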

11.
The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially higher proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss SAE for zero-inflated data under a two-part random effects model that accounts for the excess zeros in the data. Empirical results show that the proposed method for SAE works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed small area estimator. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.
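A two-part random effects model of the kind described splits the response into an occurrence part and an intensity part, each with its own area effect (schematic notation, ours):

    logit P(y_{ij} > 0) = x_{ij}^\top \beta_1 + u_{1i},   u_{1i} \sim N(0, \sigma_1^2)
    E[y_{ij} \mid y_{ij} > 0] = x_{ij}^\top \beta_2 + u_{2i},   u_{2i} \sim N(0, \sigma_2^2)

The small area mean then combines the two parts, e.g. via P(y > 0) \cdot E[y \mid y > 0], instead of forcing one linear mixed model to absorb the excess zeros.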

12.
Many quantities arising in non-life insurance depend on claim severity distributions, which are usually modeled assuming a parametric form. Obtaining good estimates of the quantities, therefore, reduces to having good estimates of the model parameters. However, the notion of ‘good estimate’ depends on the problem at hand. For example, the maximum likelihood estimators (MLEs) are efficient, but they generally lack robustness. Since outliers are common in insurance loss data, it is important to have a method that allows one to balance between efficiency and robustness. Guided by this philosophy, in the present paper we suggest a general estimation method that we call the method of trimmed moments (MTM). This method is appropriate for various model-fitting situations including those for which a close fit in one or both tails of the distribution is not required. The MTM estimators can achieve various degrees of robustness, and they also allow the decision maker to easily see the actions of the estimators on the data, which makes them particularly appealing. We illustrate these features with detailed theoretical analyses and simulation studies of the MTM estimators in the case of location–scale families and several loss distributions such as lognormal and Pareto. As a further illustration, we analyze a real data set concerning hurricane damages in the United States from 1925 to 1995.
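Here is a minimal sketch of the trimmed-moments idea for one concrete case, the lognormal, which is a location–scale family on the log scale; the trimming fractions and helper names are our own, and the paper's general construction covers much more than this special case.

    import numpy as np
    from scipy import stats, integrate

    def sample_trimmed_moment(x, a, b, k=1):
        # k-th sample trimmed moment: drop the smallest n*a and largest n*b
        # order statistics, then average the k-th powers of the rest.
        x = np.sort(np.asarray(x))
        n = len(x)
        lo, hi = int(np.floor(n * a)), n - int(np.floor(n * b))
        return np.mean(x[lo:hi] ** k)

    def normal_trimmed_moment(a, b, k=1):
        # matching population trimmed moment of a standard normal:
        # (1/(1-a-b)) * integral from a to 1-b of Phi^{-1}(u)^k du
        val, _ = integrate.quad(lambda u: stats.norm.ppf(u) ** k, a, 1.0 - b)
        return val / (1.0 - a - b)

    def mtm_lognormal(x, a=0.10, b=0.10):
        # On the log scale the model is mu + sigma*Z with Z standard normal,
        # so trimmed moments are polynomial in (mu, sigma) and solve directly:
        #   m1 -> mu + sigma*c1,  m2 -> mu^2 + 2*mu*sigma*c1 + sigma^2*c2.
        y = np.log(x)
        m1 = sample_trimmed_moment(y, a, b, 1)
        m2 = sample_trimmed_moment(y, a, b, 2)
        c1 = normal_trimmed_moment(a, b, 1)
        c2 = normal_trimmed_moment(a, b, 2)
        sigma = np.sqrt((m2 - m1 ** 2) / (c2 - c1 ** 2))
        mu = m1 - sigma * c1
        return mu, sigma

With a = b = 0 this reduces to the ordinary method of moments on the log scale; larger trimming fractions buy robustness to outliers at some efficiency cost, which is exactly the trade-off the abstract describes.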

13.
We compare minimum Hellinger distance and minimum Hellinger disparity estimates for U-shaped beta distributions. Given suitable density estimates, both methods are known to be asymptotically efficient when the data come from the assumed model family, and robust to small perturbations from the model family. Most implementations use kernel density estimates, which may not be appropriate for U-shaped distributions. We compare fixed binwidth histograms, percentile mesh histograms, and averaged shifted histograms. Minimum disparity estimates are less sensitive to the choice of density estimate than are minimum distance estimates, and the percentile mesh histogram gives the best results for both minimum distance and minimum disparity estimates. Minimum distance estimates are biased and a bias-corrected method is proposed. Minimum disparity estimates and bias-corrected minimum distance estimates are comparable to maximum likelihood estimates when the model holds, and give better results than either method of moments or maximum likelihood when the data are discretized or contaminated. Although our results are for the beta density, the implementations are easily modified for other U-shaped distributions such as the Dirichlet or normal generated distribution.
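A minimal sketch of minimum Hellinger distance estimation for a beta model, using a fixed-binwidth histogram as the density estimate; the paper's percentile-mesh and averaged shifted histograms would slot in where np.histogram is used, and the function name is our own.

    import numpy as np
    from scipy import stats, optimize

    def min_hellinger_beta(x, bins=20):
        # fixed-binwidth histogram density estimate on [0, 1]
        counts, edges = np.histogram(x, bins=bins, range=(0.0, 1.0), density=True)
        mids = 0.5 * (edges[:-1] + edges[1:])
        width = edges[1] - edges[0]

        def hellinger2(params):
            a, b = params
            if a <= 0 or b <= 0:
                return np.inf
            f = stats.beta.pdf(mids, a, b)
            # squared Hellinger-type distance between the histogram estimate
            # and the beta(a, b) density, approximated on the bin grid
            return np.sum((np.sqrt(counts) - np.sqrt(f)) ** 2) * width

        res = optimize.minimize(hellinger2, x0=np.array([0.5, 0.5]), method='Nelder-Mead')
        return res.x  # (a_hat, b_hat)

For a U-shaped beta the density grows without bound at the boundaries, which a fixed-binwidth histogram can represent poorly; that sensitivity to the choice of density estimate is what the comparison in the abstract is about.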

14.
A framework for progressively improving small area population estimates
The paper presents a framework for small area population estimation that enables users to select a method that is fit for the purpose. The adjustments to input data that are needed before use are outlined, with emphasis on developing consistent time series of inputs. We show how geographical harmonization of small areas, which is crucial to comparisons over time, can be achieved. For two study regions, the East of England and Yorkshire and the Humber, the differences in output and consequences of adopting different methods are illustrated. The paper concludes with a discussion of how data that have come on stream since 1998 might be included in future small area estimates.

15.
We propose a new procedure for combining multiple tests in samples of right-censored observations. The new method is based on multiple constrained censored empirical likelihood where the constraints are formulated as linear functionals of the cumulative hazard functions. We prove a version of Wilks’ theorem for the multiple constrained censored empirical likelihood ratio, which provides a simple reference distribution for the test statistic of our proposed method. A useful application of the proposed method is, for example, examining the survival experience of different populations by combining different weighted log-rank tests. Real data examples are given using the log-rank and Gehan-Wilcoxon tests. In a simulation study of two sample survival data, we compare the proposed method of combining tests to previously developed procedures. The results demonstrate that, in addition to its computational simplicity, the combined test performs comparably to, and in some situations more reliably than, previously developed procedures. Statistical software is available in the R package ‘emplik’.
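In schematic form (our notation), with k constraints expressed as linear functionals of the cumulative hazard, \int g_j(t)\,d\Lambda(t) = \theta_j for j = 1, \dots, k, the Wilks-type result referred to above says that under the null hypothesis

    -2 \log R_n(\theta_1, \dots, \theta_k) \;\xrightarrow{d}\; \chi^2_k,

so a combined test can simply be referred to a chi-squared distribution with degrees of freedom equal to the number of constraints being combined.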

16.
Including time-varying covariates is a popular extension to the Cox model and a suitable approach for dealing with non-proportional hazards. However, partial likelihood (PL) estimation of this model has three shortcomings: (i) estimated regression coefficients can be less accurate in small samples with heavy censoring; (ii) the baseline hazard is not directly estimated; and (iii) a covariance matrix for both the regression coefficients and the baseline hazard is not easily produced. We address these by developing a maximum likelihood (ML) approach that jointly estimates the regression coefficients and the baseline hazard using a constrained optimisation ensuring the latter's non-negativity. We demonstrate the asymptotic properties of these estimates, show via simulation their increased accuracy compared to PL estimates in small samples, and show that our method produces smoother baseline hazard estimates than the Breslow estimator. Finally, we apply our method to two examples, including an important real-world financial example estimating time to default for retail home loans. We demonstrate that using our ML estimate of the baseline hazard can give much clearer corroboratory evidence of the ‘humped hazard’, whereby the risk of loan default rises to a peak and then later falls.
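Here is a minimal sketch of jointly estimating regression coefficients and a piecewise-constant baseline hazard by constrained ML; it simplifies the approach the abstract describes (fixed covariates, a crude time grid), and all names are ours.

    import numpy as np
    from scipy.optimize import minimize

    def fit_joint_ml(t, d, X, cuts):
        # t: observed times; d: event indicators (1 = event, 0 = censored);
        # X: covariate matrix; cuts: right endpoints of the J hazard
        # intervals, with cuts[-1] >= t.max().
        n, p = X.shape
        J = len(cuts)
        edges = np.concatenate(([0.0], cuts))
        # exposure[i, j]: time subject i spends at risk in interval j
        exposure = np.clip(np.minimum(t[:, None], edges[1:]) - edges[:-1], 0.0, None)
        at_event = np.searchsorted(cuts, t)  # interval containing t_i

        def negloglik(par):
            beta, h = par[:p], par[p:]
            eta = X @ beta
            # piecewise-exponential PH log-likelihood:
            #   sum_i d_i*(log h_{j(i)} + eta_i) - exp(eta_i) * sum_j h_j * exposure_ij
            ll = np.sum(d * (np.log(h[at_event] + 1e-300) + eta))
            ll -= np.sum(np.exp(eta) * (exposure @ h))
            return -ll

        x0 = np.concatenate((np.zeros(p), np.full(J, 0.1)))
        bounds = [(None, None)] * p + [(0.0, None)] * J  # hazard kept non-negative
        res = minimize(negloglik, x0, method='L-BFGS-B', bounds=bounds)
        beta_hat, h_hat = res.x[:p], res.x[p:]
        cov = res.hess_inv.todense()  # rough joint covariance for (beta, h)
        return beta_hat, h_hat, cov

Because the hazard enters the likelihood directly, a single fit returns the baseline hazard estimate and a joint covariance approximation in one pass, which is exactly what the abstract notes is awkward to obtain from partial likelihood.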

17.
We consider a Bayesian nonignorable model to accommodate a nonignorable selection mechanism for predicting small area proportions. Our main objective is to extend a model on selection bias in a previously published paper, coauthored by four authors, to accommodate small areas. These authors assume that the survey weights (or their reciprocals that we also call selection probabilities) are available, but there is no simple relation between the binary responses and the selection probabilities. To capture the nonignorable selection bias within each area, they assume that the binary responses and the selection probabilities are correlated. To accommodate the small areas, we extend their model to a hierarchical Bayesian nonignorable model and we use Markov chain Monte Carlo methods to fit it. We illustrate our methodology using a numerical example obtained from data on activity limitation in the U.S. National Health Interview Survey. We also perform a simulation study to assess the effect of the correlation between the binary responses and the selection probabilities.

18.
Small area estimators in linear models are typically expressed as a convex combination of direct estimators and synthetic estimators from a suitable model. When auxiliary information used in the model is measured with error, a new estimator, accounting for the measurement error in the covariates, has been proposed in the literature. Recently, for the area‐level model, Ybarra & Lohr (Biometrika, 95, 2008, 919) suggested a suitable modification to the estimates of small area means based on the Fay & Herriot (J. Am. Stat. Assoc., 74, 1979, 269) model where some of the covariates are measured with error. They used a frequentist approach based on the method of moments. Adopting a Bayesian approach, we propose to rewrite the measurement error model as a hierarchical model; we use improper non‐informative priors on the model parameters and show, under a mild condition, that the joint posterior distribution is proper and the marginal posterior distributions of the model parameters have finite variances. We conduct a simulation study exploring different scenarios. The Bayesian predictors we propose show smaller empirical mean squared errors than the frequentist predictors of Ybarra & Lohr (Biometrika, 95, 2008, 919), and they seem also to be more stable in terms of variability and bias. We apply the proposed methodology to two real examples.
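The measurement-error model being recast hierarchically can be sketched as follows (generic notation consistent with the abstract):

    y_i = \theta_i + e_i,               e_i \sim N(0, \psi_i)        (direct estimates)
    \theta_i = x_i^\top \beta + v_i,    v_i \sim N(0, \sigma_v^2)    (linking model)
    \hat X_i = x_i + \eta_i,            \eta_i \sim N(0, C_i)        (covariates observed with error)

with improper flat priors on \beta and the variance components; part of the paper's contribution is showing that, under a mild condition, the resulting joint posterior is nevertheless proper.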

19.
Large governmental surveys typically provide accurate national statistics. To decrease the mean squared error of estimates for small areas, i.e., domains in which the sample size is small, auxiliary variables from administrative records are often used as covariates in a mixed linear model. It is generally assumed that the auxiliary information is available for every small area. In many cases, though, such information is available for only some of the small areas, either from another survey or from a previous administration of the same survey. The authors propose and study small area estimators that use multivariate models to combine information from several surveys. They discuss computational algorithms, and a simulation study indicates that if quantities in the different surveys are sufficiently correlated, substantial gains in efficiency can be achieved.

20.
Linear mixed effects models have been popular in small area estimation problems for modeling survey data when the sample size in one or more areas is too small for reliable inference. However, when the data are restricted to a bounded interval, the linear model may be inappropriate, particularly if the data are near the boundary. Nonlinear sampling models are becoming increasingly popular for small area estimation problems when the normal model is inadequate. This paper studies the use of a beta distribution as an alternative to the normal distribution as a sampling model for survey estimates of proportions which take values in (0, 1). Inference for small area proportions based on the posterior distribution of a beta regression model ensures that point estimates and credible intervals take values in (0, 1). Properties of a hierarchical Bayesian small area model with a beta sampling distribution and logistic link function are presented and compared to those of the linear mixed effect model. Propriety of the posterior distribution using certain noninformative priors is shown, and behavior of the posterior mean as a function of the sampling variance and the model variance is described. An example using 2010 Small Area Income and Poverty Estimates (SAIPE) data is given, and a numerical example studying small sample properties of the model is presented.
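One standard mean-precision parameterization consistent with the abstract's description (the paper's exact choice may differ) is

    y_i \mid \theta_i \sim \mathrm{Beta}(\theta_i \phi_i, (1 - \theta_i)\phi_i),   \mathrm{logit}(\theta_i) = x_i^\top \beta + u_i,   u_i \sim N(0, \sigma_u^2),

so that E[y_i \mid \theta_i] = \theta_i \in (0, 1) and Var(y_i \mid \theta_i) = \theta_i (1 - \theta_i)/(\phi_i + 1), which keeps point estimates and credible intervals inside the unit interval by construction.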
