1.

Exploring spatial dependence in area-level random effect model for disaggregate-level crop yield estimation

Hukum Chandra 《Journal of applied statistics》2013,40(4):823-842

This paper describes an application of small area estimation (SAE) techniques under area-level spatial random effect models when only area (or district or aggregated) level data are available. In particular, the SAE approach is applied to produce district-level model-based estimates of crop yield for paddy in the state of Uttar Pradesh in India using the data on crop-cutting experiments supervised under the Improvement of Crop Statistics scheme and the secondary data from the Population Census. The diagnostic measures are illustrated to examine the model assumptions as well as reliability and validity of the generated model-based small area estimates. The results show a considerable gain in precision in model-based estimates produced applying SAE. Furthermore, the model-based estimates obtained by exploiting spatial information are more efficient than those obtained by ignoring this information. However, both of these model-based estimates are more efficient than the direct survey estimate. In many districts, there is no survey data and therefore it is not possible to produce direct survey estimates for these districts. The model-based estimates generated using SAE are still reliable for such districts. These estimates produced by using SAE will provide invaluable information to policy analysts and decision-makers.
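The idea of combining a direct survey estimate with a model prediction, and of falling back on the model alone where no survey data exist, can be sketched as follows. This is only an illustration under a standard non-spatial Fay-Herriot-type area-level model, which is an assumption here and not necessarily the spatial model used in the paper. The function name `composite_estimates`, the variance `sigma2_u` (treated as known) and all numbers are hypothetical.

```python
def composite_estimates(direct, sampling_var, synthetic, sigma2_u):
    """Shrink each direct estimate toward its model-based (synthetic) prediction.

    direct[i]       : direct survey estimate for area i, or None if unsampled
    sampling_var[i] : sampling variance of direct[i], or None if unsampled
    synthetic[i]    : regression prediction for area i (assumed already fitted)
    sigma2_u        : variance of the area random effect (assumed known here)
    """
    estimates = []
    for y, psi, pred in zip(direct, sampling_var, synthetic):
        if y is None:
            # No survey data: only the model prediction is available.
            estimates.append(pred)
        else:
            # Weight on the direct estimate grows as its sampling variance shrinks.
            gamma = sigma2_u / (sigma2_u + psi)
            estimates.append(gamma * y + (1.0 - gamma) * pred)
    return estimates


# Hypothetical yields for three districts; the second has no survey data.
result = composite_estimates(
    direct=[20.0, None, 30.0],
    sampling_var=[1.0, None, 9.0],
    synthetic=[22.0, 25.0, 24.0],
    sigma2_u=3.0,
)
```

In practice the random-effect variance and the regression coefficients are estimated from the data, and a spatial version additionally lets the area effects of neighbouring districts be correlated.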

2.

Disaggregate-level estimates of indebtedness in the state of Uttar Pradesh in India: an application of small-area estimation technique

Hukum Chandra, Nicola Salvati, U. C. Sud 《Journal of applied statistics》2011,38(11):2413-2432

The National Sample Survey Organisation (NSSO) surveys are the main source of official statistics in India, and generate a range of invaluable data at the macro level (e.g. state and national levels). However, the NSSO data cannot be used directly to produce reliable estimates at the micro level (e.g. district or further disaggregate level) due to small sample sizes. There is a rapidly growing demand for such micro-level statistics in India, as the country is moving from a centralized to a more decentralized planning system. In this article, we employ small-area estimation (SAE) techniques to derive model-based estimates of the proportion of indebted households at district or at other small-area levels in the state of Uttar Pradesh in India by linking data from the Debt–Investment Survey 2002–2003 of NSSO, the Population Census 2001 and the Agriculture Census 2003. Our results show that the model-based estimates are precise and representative. For many small areas, it is not even possible to produce estimates using sample data alone. The model-based estimates generated using SAE are still reliable for such areas. The estimates are expected to provide invaluable information to policy analysts and decision-makers.

3.

Small Area Estimation Using Estimated Population Level Auxiliary Data

Hukum Chandra, U. C. Sud, Yogita Gharde 《Communications in Statistics - Simulation and Computation》2015,44(5):1197-1209

Unit level linear mixed models are often used in small area estimation (SAE), and the empirical best linear unbiased prediction (EBLUP) is widely used for the estimation of small area means under such models. However, EBLUP requires population level auxiliary data, at least area specific aggregated values. Sometimes population level auxiliary data are either not available or not consistent with the survey data. We describe a SAE method that uses estimated population auxiliary information. Empirical results show that the proposed method for SAE produces an efficient set of small area estimates.

4.

What Level of Statistical Model Should We Use in Small Area Estimation?

Mohammad‐Reza Namazi‐Rad, David Steel 《Australian & New Zealand Journal of Statistics》2015,57(2):275-298

If unit‐level data are available, small area estimation (SAE) is usually based on models formulated at the unit level, but they are ultimately used to produce estimates at the area level and thus involve area‐level inferences. This paper investigates the circumstances under which using an area‐level model may be more effective. Linear mixed models (LMMs) fitted using different levels of data are applied in SAE to calculate synthetic estimators and empirical best linear unbiased predictors (EBLUPs). The performance of area‐level models is compared with unit‐level models when both individual and aggregate data are available. A key factor is whether there are substantial contextual effects. Ignoring these effects in unit‐level working models can cause biased estimates of regression parameters. The contextual effects can be automatically accounted for in the area‐level models. Using synthetic and EBLUP techniques, small area estimates based on different levels of LMMs are investigated in this paper by means of a simulation study.

5.

Small Area Estimation for Zero-Inflated Data

Hukum Chandra, U. C. Sud 《Communications in Statistics - Simulation and Computation》2013,42(5):632-643

The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially larger proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss the SAE for zero-inflated data under a two-part random effects model that accounts for excess zeros in the data. Empirical results show that the proposed method for SAE works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed estimator of small areas. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.

6.

Spatial generalized linear mixed models in small area estimation

Mahmoud Torabi 《Revue canadienne de statistique》2019,47(3):426-437

In survey sampling, policy decisions regarding the allocation of resources to sub‐groups of a population depend on reliable predictors of their underlying parameters. However, in some sub‐groups, called small areas due to small sample sizes relative to the population, the information needed for reliable estimation is typically not available. Consequently, data on a coarser scale are used to predict the characteristics of small areas. Mixed models are the primary tools in small area estimation (SAE) and also borrow information from alternative sources (e.g., previous surveys and administrative and census data sets). In many circumstances, small area predictors are associated with location. For instance, in the case of chronic disease or cancer, it is important for policy makers to understand spatial patterns of disease in order to determine small areas with high risk of disease and establish prevention strategies. The literature considering SAE with spatial random effects is sparse and mostly in the context of spatial linear mixed models. In this article, small area models are proposed for the class of spatial generalized linear mixed models to obtain small area predictors and corresponding second‐order unbiased mean squared prediction errors via Taylor expansion and a parametric bootstrap approach. The performance of the proposed approach is evaluated through simulation studies and application of the models to a real esophageal cancer data set from Minnesota, U.S.A. The Canadian Journal of Statistics 47: 426–437; 2019 © 2019 Statistical Society of Canada

7.

A spatial Bayesian semiparametric mixture model for positive definite matrices with applications in diffusion tensor imaging

Zhou Lan, Brian J. Reich, Dipankar Bandyopadhyay 《Revue canadienne de statistique》2021,49(1):129-149

Studies on diffusion tensor imaging (DTI) quantify the diffusion of water molecules in a brain voxel using an estimated 3 × 3 symmetric positive definite (p.d.) diffusion tensor matrix. Due to the challenges associated with modelling matrix‐variate responses, the voxel‐level DTI data are usually summarized by univariate quantities, such as fractional anisotropy. This approach leads to evident loss of information. Furthermore, DTI analyses often ignore the spatial association among neighbouring voxels, leading to imprecise estimates. Although the spatial modelling literature is rich, modelling spatially dependent p.d. matrices is challenging. To mitigate these issues, we propose a matrix‐variate Bayesian semiparametric mixture model, where the p.d. matrices are distributed as a mixture of inverse Wishart distributions, with the spatial dependence captured by a Markov model for the mixture component labels. Related Bayesian computing is facilitated by conjugacy results and use of the double Metropolis–Hastings algorithm. Our simulation study shows that the proposed method is more powerful than competing non‐spatial methods. We also apply our method to investigate the effect of cocaine use on brain microstructure. By extending spatial statistics to matrix‐variate data, we contribute to providing a novel and computationally tractable inferential tool for DTI analysis.

8.

Generalised Linear Models Incorporating Population Level Information: An Empirical Likelihood Based Approach

Chaudhuri S, Handcock MS, Rendall MS 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2008,70(2):311-328

In many situations information from a sample of individuals can be supplemented by population level information on the relationship between a dependent variable and explanatory variables. Inclusion of the population level information can reduce bias and increase the efficiency of the parameter estimates. Population level information can be incorporated via constraints on functions of the model parameters. In general the constraints are nonlinear, making the task of maximum likelihood estimation harder. In this paper we develop an alternative approach exploiting the notion of an empirical likelihood. It is shown that within the framework of generalised linear models, the population level information corresponds to linear constraints, which are comparatively easy to handle. We provide a two-step algorithm that produces parameter estimates using only unconstrained estimation. We also provide computable expressions for the standard errors. We give an application to demographic hazard modelling by combining panel survey data with birth registration data to estimate annual birth probabilities by parity.

9.

TR Multivariate Conditional Median Estimation

Jan G. De Gooijer, Ali Gannoun 《Communications in Statistics - Simulation and Computation》2013,42(1):165-176

An affine equivariant version of the nonparametric spatial conditional median (SCM) is constructed, using an adaptive transformation–retransformation (TR) procedure. The relative performance of SCM estimates, computed with and without applying the TR-procedure, is compared through simulations. Also included is the vector of coordinate conditional, kernel-based medians (VCCMs). The methodology is illustrated via an empirical data set. The simulations indicate that the TR-SCM estimator is more efficient than the SCM estimator for data generated from asymmetric contaminated trivariate normals. However, when the dimension of the covariates increases, the efficiency of the TR-SCM estimator decreases. The TR-VCCM and VCCM estimators lack efficiency, and consequently should not be used in practice.

10.

Modelling bias in combining small area prevalence estimates from multiple surveys

Manzi G, Spiegelhalter DJ, Turner RM, Flowers J, Thompson SG 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2011,174(1):31-50

11.

On the Bayesian analysis of the mixture of power function distribution using the complete and the censored sample

M. Saleem, M. Aslam, P. Economou 《Journal of applied statistics》2010,37(1):25-40

The power function distribution is often used to study electrical component reliability. In this paper, we model a heterogeneous population using the two-component mixture of the power function distribution. A comprehensive simulation scheme including a large number of parameter points is followed to highlight the properties and behavior of the estimates in terms of sample size, censoring rate, parameter size and the proportion of the components of the mixture. The parameters of the power function mixture are estimated and compared using the Bayes estimates. Simulated mixture data with censored observations are generated by probabilistic mixing for computational purposes. Elegant closed form expressions for the Bayes estimators and their variances are derived for the censored sample as well as for the complete sample. Some interesting comparisons and properties of the estimates are observed and presented. The system of three non-linear equations, required to be solved iteratively for the computations of maximum likelihood (ML) estimates, is derived. The complete sample expressions for the ML estimates and for their variances are also given. The components of the information matrix are constructed as well. Uninformative as well as informative priors are assumed for the derivation of the Bayes estimators. A real-life mixture data example has also been discussed. The posterior predictive distribution with the informative Gamma prior is derived, and the equations required to find the lower and upper limits of the predictive intervals are constructed. The Bayes estimates are evaluated under the squared error loss function.

12.

Fluid flow pattern analysis in a trough region: a nonparametric approach

Rahul Mazumder 《Journal of applied statistics》2008,35(6):633-645

This paper aims at identifying statistically different circulation patterns characterising fluid flow in the trough region between two adjacent asymmetric waveforms, using the velocity data collected by a 3D acoustic Doppler velocimeter. Statistical clustering has been performed using ideas originating from information theory and scale space theory in computer vision for splitting the trough region into different spatially connected segments (identifying the circulation bubble in the process) on the basis of circulation patterns. The paper attempts to visualise the fluid fluctuations in the trough region, with emphasis on the circulation region, by simulating the directional fluctuations of fluid particles from the kernel density estimates learned from the experimental data. The image representation of the estimate of the spatial turbulent kinetic energy (TKE) function reveals interesting features corresponding to the regions of high TKE, suggesting the possibilities for further research in this area along the lines of feature extraction and image analysis.

13.

Analyzing non-stationarity in cement stone pit by median polish interpolation: a case study

Bulent Tutmez 《Journal of applied statistics》2014,41(2):454-466

The raw materials utilized in the manufacture of cement consist mainly of lime, silica, alumina and iron oxide. Spatial evaluation of these main chemical constituents of cement is of crucial importance for effective production. Because these components are composed of raw materials such as limestone and marl, the spatial relationships in a calcareous marl stone pit were taken into consideration. In practice, spatial field data taken from a cement quarry may include some variations and trends. For modeling and removing spatial trend in a cement raw material quarry as well as providing unbiased estimates, median polish kriging was used. By using the variation of the data itself, some approximations and interpolations were carried out. It was recorded that the method obtained outlier-resistant estimation of spatial trend without needing an external exploratory variable. In addition, it provided very effective estimations and additional information for analyzing spatial non-stationary data.
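The trend-removal step behind median polish kriging can be illustrated with Tukey's median polish on gridded data. The sketch below covers only that step, under the assumption that the samples lie on a complete rectangular grid. The kriging of the residuals is not shown, and the function name and example values are hypothetical rather than taken from the paper.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Decompose table[i][j] into overall + row[i] + col[j] + resid[i][j]."""
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(r) for r in table]
    overall, row, col = 0.0, [0.0] * n_rows, [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep the median of each row out of the residuals.
        row_delta = [median(r) for r in resid]
        for i in range(n_rows):
            row[i] += row_delta[i]
            resid[i] = [v - row_delta[i] for v in resid[i]]
        # Re-centre the column effects, moving their median into the overall term.
        m = median(col)
        overall, col = overall + m, [c - m for c in col]
        # Sweep the median of each column out of the residuals.
        col_delta = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col[j] += col_delta[j]
            for i in range(n_rows):
                resid[i][j] -= col_delta[j]
        # Re-centre the row effects in the same way.
        m = median(row)
        overall, row = overall + m, [r - m for r in row]
        if max(abs(d) for d in row_delta + col_delta) < tol:
            break
    return overall, row, col, resid


# Hypothetical grid: an additive trend plus one outlying cell (bottom right).
grid = [[1, 2, 3], [2, 3, 4], [3, 4, 50]]
overall, row, col, resid = median_polish(grid)
```

Because medians are used instead of means, the single outlying cell stays in the residuals and does not distort the fitted row and column trend, which is the outlier resistance the abstract refers to.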

14.

Estimation and prediction for Chen distribution with bathtub shape under progressive censoring

Tanmay Kayal, Devendra Pratap Singh, Manoj Kumar Rastogi 《Journal of Statistical Computation and Simulation》2017,87(2):348-366

We consider estimation of the unknown parameters of the Chen distribution [Chen Z. A new two-parameter lifetime distribution with bathtub shape or increasing failure rate function. Statist Probab Lett. 2000;49:155–161] with bathtub shape using progressive-censored samples. We obtain maximum likelihood estimates by making use of an expectation–maximization algorithm. Different Bayes estimates are derived under squared error and balanced squared error loss functions. It is observed that the associated posterior distribution appears in an intractable form. So we have used an approximation method to compute these estimates. A Metropolis–Hastings (MH) algorithm is also proposed and some more approximate Bayes estimates are obtained. An asymptotic confidence interval is constructed using the observed Fisher information matrix. Bootstrap intervals are proposed as well. Samples generated from the MH algorithm are further used in the construction of HPD intervals. Finally, we have obtained prediction intervals and estimates for future observations in one- and two-sample situations. A numerical study is conducted to compare the performance of the proposed methods using simulations. Finally, we analyse real data sets for illustration purposes.
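For orientation, the two-parameter Chen distribution has distribution function F(x) = 1 − exp{λ(1 − exp(x^β))} for x > 0, with hazard λβx^(β−1)exp(x^β), which is bathtub-shaped when β < 1. The sketch below evaluates these functions and draws complete (uncensored) samples by inverting F. Progressive censoring and the estimation methods of the paper are not implemented, and the function names and parameter values are illustrative assumptions.

```python
import math
import random


def chen_cdf(x, lam, beta):
    """Distribution function F(x) = 1 - exp(lam * (1 - exp(x**beta)))."""
    return 1.0 - math.exp(lam * (1.0 - math.exp(x ** beta)))


def chen_hazard(x, lam, beta):
    """Hazard rate lam * beta * x**(beta - 1) * exp(x**beta)."""
    return lam * beta * x ** (beta - 1.0) * math.exp(x ** beta)


def chen_quantile(u, lam, beta):
    """Inverse of chen_cdf, obtained by solving F(x) = u for x."""
    return math.log(1.0 - math.log(1.0 - u) / lam) ** (1.0 / beta)


def chen_sample(n, lam, beta, seed=0):
    """Draw n complete observations by inverse-transform sampling."""
    rng = random.Random(seed)
    return [chen_quantile(rng.random(), lam, beta) for _ in range(n)]


# Illustrative parameters: beta < 1 gives a bathtub-shaped hazard.
lam, beta = 1.0, 0.5
hazards = [chen_hazard(x, lam, beta) for x in (0.1, 1.0, 4.0)]
draws = chen_sample(5, lam, beta)
```

With β = 0.5 the hazard is high near zero, lowest at x = 1 and rising again afterwards, which matches the bathtub shape mentioned in the abstract.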

15.

Small area estimation strategies for large population surveys: a comparison of design and model-based methods

Zhaonan Li, Xinyi Xu 《Journal of Statistical Computation and Simulation》2017,87(4):817-833

Small area estimation (SAE) is concerned with how to reliably estimate population quantities of interest when some areas or domains have very limited samples. This is an important issue in large population surveys, because the geographical areas or groups with only small samples or even no samples are often of interest to researchers and policy-makers. For example, large population health surveys, such as the Behavioural Risk Factor Surveillance System and the Ohio Medicaid Assessment Survey (OMAS), are regularly conducted for monitoring insurance coverage and healthcare utilization. Classic approaches usually provide accurate estimators at the state level or large geographical region level, but they fail to provide reliable estimators for many rural counties where the samples are sparse. Moreover, a systematic evaluation of the performance of SAE methods in real-world settings is lacking in the literature. In this paper, we propose a Bayesian hierarchical model with constraints on the parameter space and show that it provides superior estimators for county-level adult uninsured rates in Ohio based on the 2012 OMAS data. Furthermore, we perform extensive simulation studies to compare our methods with a collection of common SAE strategies, including direct estimators, synthetic estimators, composite estimators, and Datta GS, Ghosh M, Steorts R, Maples J.'s [Bayesian benchmarking with applications to small area estimation. Test 2011;20(3):574–588] Bayesian hierarchical model-based estimators. To set a fair basis for comparison, we generate our simulation data with characteristics mimicking the real OMAS data, so that neither model-based nor design-based strategies use the true model specification. The estimators based on our proposed model are shown to outperform other estimators for small areas in both simulation study and real data analysis.

16.

Covariate Decomposition Methods for Longitudinal Missing‐at‐Random Data and Predictors Associated with Subject‐Specific Effects

John M. Neuhaus, Charles E. McCulloch 《Australian & New Zealand Journal of Statistics》2014,56(4):331-345

Investigators often gather longitudinal data to assess changes in responses over time within subjects and to relate these changes to within‐subject changes in predictors. Missing data are common in such studies and predictors can be correlated with subject‐specific effects. Maximum likelihood methods for generalized linear mixed models provide consistent estimates when the data are ‘missing at random’ (MAR) but can produce inconsistent estimates in settings where the random effects are correlated with one of the predictors. On the other hand, conditional maximum likelihood methods (and closely related maximum likelihood methods that partition covariates into between‐ and within‐cluster components) provide consistent estimation when random effects are correlated with predictors but can produce inconsistent covariate effect estimates when data are MAR. Using theory, simulation studies, and fits to example data this paper shows that decomposition methods using complete covariate information produce consistent estimates. In some practical cases these methods, that ostensibly require complete covariate information, actually only involve the observed covariates. These results offer an easy‐to‐use approach to simultaneously protect against bias from both cluster‐level confounding and MAR missingness in assessments of change.

17.

Estimation under modified Weibull distribution based on right censored generalized order statistics

Saieed F. Ateya 《Journal of applied statistics》2013,40(12):2720-2734

In this paper, the maximum likelihood (ML) and Bayes, by using Markov chain Monte Carlo (MCMC), methods are considered to estimate the parameters of three-parameter modified Weibull distribution (MWD(β, τ, λ)) based on a right censored sample of generalized order statistics (gos). Simulation experiments are conducted to demonstrate the efficiency of the proposed methods. Some comparisons are carried out between the ML and Bayes methods by computing the mean squared errors (MSEs), Akaike's information criteria (AIC) and Bayesian information criteria (BIC) of the estimates to illustrate the paper. Three real data sets from Weibull(α, β) distribution are introduced and analyzed using the MWD(β, τ, λ) and also using the Weibull(α, β) distribution. A comparison is carried out between the mentioned models based on the corresponding Kolmogorov–Smirnov (K–S) test statistic, AIC and BIC to emphasize that the MWD(β, τ, λ) fits the data better than the other distribution. All parameters are estimated based on type-II censored sample, censored upper record values and progressively type-II censored sample which are generated from the real data sets.

18.

Practical Maximum Pseudolikelihood for Spatial Point Patterns (with Discussion) Cited by: 3 (self-citations: 0; citations by others: 3)

Adrian Baddeley & Rolf Turner 《Australian & New Zealand Journal of Statistics》2000,42(3):283-322

This paper describes a technique for computing approximate maximum pseudolikelihood estimates of the parameters of a spatial point process. The method is an extension of Berman & Turner's (1992) device for maximizing the likelihoods of inhomogeneous spatial Poisson processes. For a very wide class of spatial point process models the likelihood is intractable, while the pseudolikelihood is known explicitly, except for the computation of an integral over the sampling region. Approximation of this integral by a finite sum in a special way yields an approximate pseudolikelihood which is formally equivalent to the (weighted) likelihood of a loglinear model with Poisson responses. This can be maximized using standard statistical software for generalized linear or additive models, provided the conditional intensity of the process takes an 'exponential family' form. Using this approach a wide variety of spatial point process models of Gibbs type can be fitted rapidly, incorporating spatial trends, interaction between points, dependence on spatial covariates, and mark information.

19.

Spatial robust small area estimation Cited by: 1 (self-citations: 0; citations by others: 1)

Timo Schmid, Ralf T. Münnich 《Statistical Papers》2014,55(3):653-670

The accuracy of recent applications in small area statistics in many cases highly depends on the assumed properties of the underlying models and the availability of micro information. In finite population sampling, small sample sizes may increase the sensitivity of the modeling with respect to single units. In these cases, area-specific sample sizes tend to be small such that normal assumptions, even of area means, seem to be violated. Hence, applying robust estimation methods is expected to yield more reliable results. In general, two robust small area methods are applied, the robust EBLUP and the M-quantile method. Additionally, the use of adequate auxiliary information may further increase the accuracy of the estimates. In prediction based approaches where information is needed on universe level, in general, only few variables are available which can be used for modeling. In addition to variables from the dataset, in many cases further information may be available, e.g. geographical information which could indicate spatial dependencies between neighboring areas. This spatial information can be included in the modeling using spatially correlated area effects. Within the paper the classical robust EBLUP is extended to cover spatial area effects via a simultaneous autoregressive model. The performance of the different estimators is compared in a model-based simulation study.

20.

Estimating avian dispersal distances from data on ringed birds

David Thomson, Arie van Noordwijk, Ward Hagemeijer 《Journal of applied statistics》2003,30(9):1003-1008

Data from birds ringed as chicks and recaptured during subsequent breeding seasons provide information on avian natal dispersal distances. However, national patterns of ring reports are influenced by recapture rates as well as by dispersal rates. While an extensive methodology has been developed to study survival rates using models that correct for recapture rates, the same is not true for dispersal. Here, we present such a method, showing how corrections for spatial heterogeneity in recapture rate can be built into estimates of dispersal rates if detailed atlas data and ringing totals can be combined with extensive data on birds ringed as chicks and recaptured as breeding adults. We show how the method can be implemented in the software package SURVIV (White, 1992).