1.

Small Area Estimation Using Estimated Population Level Auxiliary Data

Hukum Chandra, U. C. Sud, Yogita Gharde 《Communications in Statistics - Simulation and Computation》2015,44(5):1197-1209

Unit level linear mixed models are often used in small area estimation (SAE), and the empirical best linear unbiased prediction (EBLUP) is widely used for the estimation of small area means under such models. However, EBLUP requires population level auxiliary data, at least area specific aggregated values. Sometimes population level auxiliary data is either not available or not consistent with the survey data. We describe a SAE method that uses estimated population auxiliary information. Empirical results show that the proposed method for SAE produces an efficient set of small area estimates.

2.

Untangle the structural and random zeros in statistical modelings

W. Tang, W.J. Wang, D.G. Chen 《Journal of applied statistics》2018,45(9):1714-1733

Count data with structural zeros are common in public health applications. Considerable research has focused on zero-inflated models such as zero-inflated Poisson (ZIP) and zero-inflated Negative Binomial (ZINB) models for such zero-inflated count data when used as a response variable. However, when such variables are used as predictors, the difference between structural and random zeros is often ignored and may result in biased estimates. One remedy is to include an indicator of the structural zero in the model as a predictor if observed. However, structural zeros are often not observed in practice, in which case no statistical method is available to address the bias issue. This paper aims to fill this methodological gap by developing parametric methods to model zero-inflated count data when used as predictors based on the maximum likelihood approach. The response variable can be any type of data including continuous, binary, count or even zero-inflated count responses. Simulation studies are performed to assess the numerical performance of this new approach when sample size is small to moderate. A real data example is also used to demonstrate the application of this method.

3.

Exploring spatial dependence in area-level random effect model for disaggregate-level crop yield estimation

Hukum Chandra 《Journal of applied statistics》2013,40(4):823-842

This paper describes an application of small area estimation (SAE) techniques under area-level spatial random effect models when only area (or district or aggregated) level data are available. In particular, the SAE approach is applied to produce district-level model-based estimates of crop yield for paddy in the state of Uttar Pradesh in India using the data on crop-cutting experiments supervised under the Improvement of Crop Statistics scheme and the secondary data from the Population Census. The diagnostic measures are illustrated to examine the model assumptions as well as reliability and validity of the generated model-based small area estimates. The results show a considerable gain in precision in model-based estimates produced applying SAE. Furthermore, the model-based estimates obtained by exploiting spatial information are more efficient than those obtained by ignoring this information. However, both of these model-based estimates are more efficient than the direct survey estimate. In many districts, there is no survey data and therefore it is not possible to produce direct survey estimates for these districts. The model-based estimates generated using SAE are still reliable for such districts. These estimates produced by using SAE will provide invaluable information to policy-analysts and decision-makers.

4.

Mean-Squared errors of small area estimators under a multivariate linear model for repeated measures data

Innocent Ngaruye, Dietrich Von Rosen, Martin Singull 《Communications in Statistics - Theory and Methods》2019,48(8):2060-2073

In this paper, we discuss the derivation of the first and second moments for the proposed small area estimators under a multivariate linear model for repeated measures data. The aim is to use these moments to estimate the mean-squared errors (MSE) for the predicted small area means as a measure of precision. At the first stage, we derive the MSE when the covariance matrices are known. At the second stage, a method based on parametric bootstrap is proposed for bias correction and for prediction error that reflects the uncertainty when the unknown covariance is replaced by its suitable estimator.

5.

Likelihood estimation for longitudinal zero-inflated power series regression models

E. Bahrami Samani, Y. Amirian, M. Ganjali 《Journal of applied statistics》2012,39(9):1965-1974

In this paper, a zero-inflated power series regression model for longitudinal count data with excess zeros is presented. We demonstrate how to calculate the likelihood for such data when it is assumed that the increment in the cumulative total follows a discrete distribution with a location parameter that depends on a linear function of explanatory variables. Simulation studies indicate that this method can provide improvements in obtaining standard errors of the estimates. We also calculate the dispersion index for this model. The influence of a small perturbation of the dispersion index of the zero-inflated model on likelihood displacement is also studied. The zero-inflated negative binomial regression model is illustrated on data regarding joint damage in psoriatic arthritis.

6.

Small area estimation of proportions under a spatial dependent aggregated level random effects model

Hukum Chandra, Nicola Salvati 《Communications in Statistics - Theory and Methods》2018,47(5):1234-1255

This paper describes small area estimation (SAE) of proportions under a spatial dependent generalized linear mixed model using aggregated level data. The SAE is also applied to produce reliable district level estimates and mapping of incidence of indebtedness in the State of Uttar Pradesh in India using debt and investment survey data collected by the National Sample Survey Office (NSSO) and the secondary data from the Census. The results show a significant improvement in precision of model-based estimates generated by SAE as compared to direct estimates. The estimates generated by incorporating spatial information are more efficient than those generated by ignoring this information.

7.

Copula-based predictions in small area estimation

Kanika Grover, Elif F. Acar, Mahmoud Torabi 《Revue canadienne de statistique》2020,48(4):685-711

Unit-level regression models are commonly used in small area estimation (SAE) to obtain an empirical best linear unbiased prediction of small area characteristics. The underlying assumptions of these models, however, may be unrealistic in some applications. Previous work developed a copula-based SAE model where the empirical Kendall's tau was used to estimate the dependence between two units from the same area. In this article, we propose a likelihood framework to estimate the intra-class dependence of the multivariate exchangeable copula for the empirical best unbiased prediction (EBUP) of small area means. One appeal of the proposed approach lies in its accommodation of both parametric and semi-parametric estimation approaches. Under each estimation method, we further propose a bootstrap approach to obtain a nearly unbiased estimator of the mean squared prediction error of the EBUP of small area means. The performance of the proposed methods is evaluated through simulation studies and also by a real data application.

8.

Zero-spiked regression models generated by gamma random variables with application in the resin oil production

Elizabeth M. Hashimoto, Gauss M. Cordeiro, Vicente G. Cancho, Carine Klauberg 《Journal of Statistical Computation and Simulation》2019,89(1):52-70

Zero-inflated data are more frequent when the data represent counts. However, there are practical situations in which continuous data contain an excess of zeros. In these cases, the zero-inflated Poisson, binomial or negative binomial models are not suitable. In order to reduce this gap, we propose the zero-spiked gamma-Weibull (ZSGW) model by mixing a distribution which is degenerate at zero with the gamma-Weibull distribution, which has positive support. The model attempts to estimate simultaneously the effects of explanatory variables on the response variable and the zero-spiked. We consider a frequentist analysis and a non-parametric bootstrap for estimating the parameters of the ZSGW regression model. We derive the appropriate matrices for assessing local influence on the model parameters. We illustrate the performance of the proposed regression model by means of a real data set (copaiba oil resin production) from a study carried out at the Department of Forest Science of the Luiz de Queiroz School of Agriculture, University of São Paulo. Based on the ZSGW regression model, we determine the explanatory variables that can influence the excess of zeros of the resin oil production and identify influential observations. We also prove empirically that the proposed regression model can be superior to the zero-adjusted inverse Gaussian regression model to fit zero-inflated positive continuous data.

9.

GEE-based zero-inflated generalized Poisson model for clustered over or under-dispersed count data

Fatemeh Sarvi, Hossein Mahjub 《Journal of Statistical Computation and Simulation》2019,89(14):2711-2732

Zero-inflated regression models such as the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB) or zero-inflated generalized Poisson (ZIGP) regression models can model count data with excess zeros. The ZINB model can handle over-dispersed count data, and the ZIGP model can handle over- or under-dispersed count data with excess zeros as well. Moreover, the count data may be correlated because of the data collection procedure or a special study design. The clustered sampling approach is one example in which the correlation among subjects could be defined. In such situations, a marginal model using the generalized estimating equation (GEE) approach can incorporate these correlations and lead to relationships at the population level. In this study, the GEE-based zero-inflated generalized Poisson regression model was proposed to fit over- and under-dispersed clustered count data with excess zeros.

10.

On Measuring Uncertainty of Benchmarked Predictors with Application to Disease Risk Estimate

Tatsuya Kubokawa, Mana Hasukawa, Kunihiko Takahashi 《Scandinavian Journal of Statistics》2014,41(2):394-413

Empirical Bayes (EB) estimates in general linear mixed models are useful for small area estimation in the sense of increasing precision of estimation of small area means. However, one potential difficulty of EB is that the overall estimate for a larger geographical area based on a (weighted) sum of EB estimates is not necessarily identical to the corresponding direct estimate such as the overall sample mean. Another difficulty is that EB estimates yield over‐shrinking, which results in the sampling variance being smaller than the posterior variance. One way to fix these problems is the benchmarking approach based on the constrained empirical Bayes (CEB) estimators, which satisfy the constraints that the aggregated mean and variance are identical to the requested values of mean and variance. In this paper, we treat the general mixed models, derive asymptotic approximations of the mean squared error (MSE) of CEB and provide second‐order unbiased estimators of MSE based on the parametric bootstrap method. These results are applied to natural exponential families with quadratic variance functions. As a specific example, the Poisson‐gamma model is dealt with, and it is illustrated that the CEB estimates and their MSE estimates work well through real mortality data.

11.

Estimation of Median in Two-Phase Sampling Using Two Auxiliary Variables

Sat Gupta, Javid Shabbir, Shabbir Ahmad 《Communications in Statistics - Theory and Methods》2013,42(11):1815-1822

In recent years, zero-inflated count data models, such as zero-inflated Poisson (ZIP) models, have been widely used, as count data with extra zeros are very common in many practical problems. In order to model correlated count data which are either clustered or repeated and to assess the effects of continuous covariates or of time scales in a flexible way, a class of semiparametric mixed-effects models for zero-inflated count data is considered. In this article, we propose a fully Bayesian inference for such models based on a data augmentation scheme that reflects both random effects of covariates and mixture of zero-inflated distribution. A computationally efficient MCMC method which combines the Gibbs sampler and M-H algorithm is implemented to obtain the estimate of the model parameters. Finally, a simulation study and a real example are used to illustrate the proposed methodologies.

12.

A pseudo‐empirical best linear unbiased prediction approach to small area estimation using survey weights

Yong You, J. N. K. Rao 《Revue canadienne de statistique》2002,30(3):431-439

The authors develop a small area estimation method using a nested error linear regression model and survey weights. In particular, they propose a pseudo‐empirical best linear unbiased prediction (pseudo‐EBLUP) estimator to estimate small area means. This estimator borrows strength across areas through the model and makes use of the survey weights to preserve the design consistency as the area sample size increases. The proposed estimator also has a nice self‐benchmarking property. The authors also obtain an approximation to the model mean squared error (MSE) of the proposed estimator and a nearly unbiased estimator of MSE. Finally, they compare the proposed estimator with the EBLUP estimator and the pseudo‐EBLUP estimator proposed by Prasad & Rao (1999), using data analyzed earlier by Battese, Harter & Fuller (1988).

13.

Bootstrap mean squared error of a small-area EBLUP

《Journal of Statistical Computation and Simulation》2012,82(5):443-462

Concerning the estimation of linear parameters in small areas, a nested-error regression model is assumed for the values of the target variable in the units of a finite population. Then, a bootstrap procedure is proposed for estimating the mean squared error (MSE) of the EBLUP under the finite population setup. The consistency of the bootstrap procedure is studied, and a simulation experiment is carried out in order to compare the performance of two different bootstrap estimators with the approximation given by Prasad and Rao [Prasad, N.G.N. and Rao, J.N.K., 1990, The estimation of the mean squared error of small-area estimators. Journal of the American Statistical Association, 85, 163–171.]. In the numerical results, one of the bootstrap estimators shows a better bias behavior than the Prasad–Rao approximation for some of the small areas and not much worse in any case. Further, it shows less MSE in situations of moderate heteroscedasticity and under misspecification of the error distribution as normal when the true distribution is logistic or Gumbel. The proposed bootstrap method can be applied to more general types of parameters (linear or not) and predictors.

14.

Marginalized zero-inflated generalized Poisson regression

Felix Famoye, John S. Preisser 《Journal of applied statistics》2018,45(7):1247-1259

The generalized Poisson (GP) regression model has been used to model count data that exhibit over-dispersion or under-dispersion. The zero-inflated GP (ZIGP) regression model can additionally handle count data characterized by many zeros. However, the parameters of the ZIGP model cannot easily be used for inference on overall exposure effects. In order to address this problem, a marginalized ZIGP is proposed to directly model the population marginal mean count. The parameters of the marginalized zero-inflated GP model are estimated by the method of maximum likelihood. The regression model is illustrated by three real-life data sets.

15.

Small area estimation of proportions in business surveys

《Journal of Statistical Computation and Simulation》2012,82(6):783-795

Binary data are often of interest in business surveys, particularly when the aim is to characterize grouping in the businesses making up the survey population. When small area estimates are required for such binary data, use of standard estimation methods based on linear mixed models (LMMs) becomes problematic. We explore two model-based techniques of small area estimation for small area proportions, the empirical best predictor (EBP) under a generalized linear mixed model and the model-based direct estimator (MBDE) under a population-level LMM. Our empirical results show that both the MBDE and the EBP perform well. The EBP is a computationally intensive method, whereas the MBDE is easy to implement. In case of model misspecification, the MBDE also appears to be more robust. The mean-squared error (MSE) estimation of MBDE is simple and straightforward, which is in contrast to the complicated MSE estimation for the EBP.

16.

Robust small area estimation

Sanjoy K. Sinha, J. N. K. Rao 《Revue canadienne de statistique》2009,37(3):381-399

Small area estimation has received considerable attention in recent years because of growing demand for small area statistics. Basic area‐level and unit‐level models have been studied in the literature to obtain empirical best linear unbiased prediction (EBLUP) estimators of small area means. Although this classical method is useful for estimating the small area means efficiently under normality assumptions, it can be highly influenced by the presence of outliers in the data. In this article, the authors investigate the robustness properties of the classical estimators and propose a resistant method for small area estimation, which is useful for downweighting any influential observations in the data when estimating the model parameters. To estimate the mean squared errors of the robust estimators of small area means, a parametric bootstrap method is adopted here, which is applicable to models with block diagonal covariance structures. Simulations are carried out to study the behaviour of the proposed robust estimators in the presence of outliers, and these estimators are also compared to the EBLUP estimators. Performance of the bootstrap mean squared error estimator is also investigated in the simulation study. The proposed robust method is also applied to some real data to estimate crop areas for counties in Iowa, using farm‐interview data on crop areas and LANDSAT satellite data as auxiliary information. The Canadian Journal of Statistics 37: 381–399; 2009 © 2009 Statistical Society of Canada

17.

What Level of Statistical Model Should We Use in Small Area Estimation?

Mohammad‐Reza Namazi‐Rad, David Steel 《Australian & New Zealand Journal of Statistics》2015,57(2):275-298

If unit‐level data are available, small area estimation (SAE) is usually based on models formulated at the unit level, but they are ultimately used to produce estimates at the area level and thus involve area‐level inferences. This paper investigates the circumstances under which using an area‐level model may be more effective. Linear mixed models (LMMs) fitted using different levels of data are applied in SAE to calculate synthetic estimators and empirical best linear unbiased predictors (EBLUPs). The performance of area‐level models is compared with unit‐level models when both individual and aggregate data are available. A key factor is whether there are substantial contextual effects. Ignoring these effects in unit‐level working models can cause biased estimates of regression parameters. The contextual effects can be automatically accounted for in the area‐level models. Using synthetic and EBLUP techniques, small area estimates based on different levels of LMMs are investigated in this paper by means of a simulation study.

18.

Zero-inflated sum of Conway-Maxwell-Poissons (ZISCMP) regression

Kimberly F. Sellers, Derek S. Young 《Journal of Statistical Computation and Simulation》2019,89(9):1649-1673

While excess zeros are often thought to cause data over-dispersion (i.e. when the variance exceeds the mean), this implication is not absolute. One should instead consider a flexible class of distributions that can address data dispersion along with excess zeros. This work develops a zero-inflated sum-of-Conway-Maxwell-Poissons (ZISCMP) regression as a flexible analysis tool to model count data that express significant data dispersion and contain excess zeros. This class of models contains several special case zero-inflated regressions, including zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-inflated binomial (ZIB), and the zero-inflated Conway-Maxwell-Poisson (ZICMP). Through simulated and real data examples, we demonstrate class flexibility and usefulness. We further utilize it to analyze shark species data from Australia's Great Barrier Reef to assess the environmental impact of human action on the number of various species of sharks.

19.

Small area estimation using unmatched sampling and linking models

Yong You, J. N. K. Rao 《Revue canadienne de statistique》2002,30(1):3-15

The authors use a hierarchical Bayes approach to area level unmatched sampling and linking models for small area estimation. Empirically they compare inferences under unmatched models with those obtained under the customary matched sampling and linking models. They apply the proposed method to Canadian census undercoverage estimation, developing a full hierarchical Bayes approach using Markov Chain Monte Carlo sampling methods. They show that the method can provide efficient model‐based estimates. They use posterior predictive distributions to assess model fit.

20.

The relative performances of improved ridge estimators and an empirical Bayes estimator: some Monte Carlo results

Fassil Nebebe, Ah Boon Sim 《Communications in Statistics - Theory and Methods》2013,42(9):3469-3495

The relative performances of improved ridge estimators and an empirical Bayes estimator are studied by means of Monte Carlo simulations. The empirical Bayes method is seen to perform consistently better in terms of smaller MSE and more accurate empirical coverage than any of the estimators considered here. A bootstrap method is proposed to obtain more reliable estimates of the MSE of ridge estimators. Some theorems on the bootstrap for the ridge estimators are also given and they are used to provide an analytical understanding of the proposed bootstrap procedure. Empirical coverages of the ridge estimators based on the proposed procedure are generally closer to the nominal coverage when compared to their earlier counterparts. In general, except for a few cases, these coverages are still less accurate than the empirical coverages of the empirical Bayes estimator.