Similar literature
1.
The demand for reliable statistics in subpopulations, when only reduced sample sizes are available, has promoted the development of small area estimation methods. In particular, an approach that is now widely used is based on the seminal work by Battese et al. [An error-components model for prediction of county crop areas using survey and satellite data, J. Am. Statist. Assoc. 83 (1988), pp. 28–36] that uses linear mixed models (MM). We investigate alternatives when a linear MM does not hold because, on one side, linearity may not be assumed and/or, on the other, normality of the random effects may not be assumed. In particular, Opsomer et al. [Nonparametric small area estimation using penalized spline regression, J. R. Statist. Soc. Ser. B 70 (2008), pp. 265–283] propose an estimator that extends the linear MM approach to the case in which a linear relationship may not be assumed using penalized splines regression. From a very different perspective, Chambers and Tzavidis [M-quantile models for small area estimation, Biometrika 93 (2006), pp. 255–268] have recently proposed an approach for small-area estimation that is based on M-quantile (MQ) regression. This allows for models robust to outliers and to distributional assumptions on the errors and the area effects. However, when the functional form of the relationship between the qth MQ and the covariates is not linear, it can lead to biased estimates of the small area parameters. Pratesi et al. [Semiparametric M-quantile regression for estimating the proportion of acidic lakes in 8-digit HUCs of the Northeastern US, Environmetrics 19(7) (2008), pp. 687–701] apply an extended version of this approach for the estimation of the small area distribution function using a non-parametric specification of the conditional MQ of the response variable given the covariates [M. Pratesi, M.G. Ranalli, and N. Salvati, Nonparametric m-quantile regression using penalized splines, J. Nonparametric Stat. 21 (2009), pp. 287–304]. 
We will derive the small area estimator of the mean under this model, together with its mean-squared error estimator, and compare its performance to the other estimators via simulations on both real and simulated data.

2.
M-quantile models with application to poverty mapping
Over the last decade there has been growing demand for estimates of population characteristics at small area level. Unfortunately, cost constraints in the design of sample surveys lead to small sample sizes within these areas and, as a result, direct estimation using only the survey data is inappropriate since it yields estimates with unacceptable levels of precision. Small area models are designed to tackle the small sample size problem. The most popular class of models for small area estimation is random effects models that include random area effects to account for between-area variation. However, such models also depend on strong distributional assumptions, require a formal specification of the random part of the model and do not easily allow for outlier-robust inference. An alternative approach to small area estimation that is based on the use of M-quantile models was recently proposed by Chambers and Tzavidis (Biometrika 93(2):255–268, 2006) and Tzavidis and Chambers (Robust prediction of small area means and distributions. Working paper, 2007). Unlike traditional random effects models, M-quantile models do not depend on strong distributional assumptions and automatically provide outlier-robust inference. In this paper we illustrate for the first time how M-quantile models can be practically employed for deriving small area estimates of poverty and inequality. The methodology we propose improves the traditional poverty mapping methods in the following ways: (a) it enables the estimation of the distribution function of the study variable within the small area of interest both under an M-quantile and a random effects model, (b) it provides analytical, instead of empirical, estimation of the mean squared error of the M-quantile small area mean estimates and (c) it employs an outlier-robust estimation method.
The methodology is applied to data from the 2002 Living Standards Measurement Survey (LSMS) in Albania for estimating (a) district level estimates of the incidence of poverty in Albania, (b) district level inequality measures and (c) the distribution function of household per-capita consumption expenditure in each district. Small area estimates of poverty and inequality show that the poorest Albanian districts are in the mountainous regions (north and north east), with the wealthiest districts, which are also linked with high levels of inequality, in the coastal (south west) and southern part of the country. We discuss the practical advantages of our methodology and note the consistency of our results with results from previous studies. We further demonstrate the usefulness of the M-quantile estimation framework through design-based simulations based on two realistic survey data sets containing small area information and show that the M-quantile approach may be preferable when the aim is to estimate the small area distribution function.
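To make the M-quantile idea in the abstracts above more concrete, the following is a minimal sketch of a sample M-quantile of order q for a single variable, with no covariates and no area structure. It is not the estimator of Chambers and Tzavidis. The function name `m_quantile`, the Huber tuning constant `c = 1.345`, the fixed MAD-based scale, and the iteratively reweighted averaging scheme are all assumptions chosen for illustration.

```python
from statistics import median


def m_quantile(y, q, c=1.345, tol=1e-10, max_iter=500):
    """Sample M-quantile of order q using a Huber influence function.

    Illustrative sketch only: solves sum_i psi_q((y_i - m) / s) = 0, where
    psi_q(u) = 2 * psi(u) * (q if u > 0 else 1 - q) and psi is Huber's psi.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    centre = median(y)
    # Robust scale: median absolute deviation, rescaled for normal data.
    scale = median([abs(v - centre) for v in y]) / 0.6745
    if scale == 0:
        scale = 1.0  # assumption: fall back to unit scale for degenerate data
    m = centre
    for _ in range(max_iter):
        num = 0.0
        den = 0.0
        for v in y:
            r = (v - m) / scale
            # Huber weight psi(r) / r, equal to 1 inside [-c, c].
            w = 1.0 if abs(r) <= c else c / abs(r)
            # Asymmetric tilt that turns the M-estimate into an M-quantile.
            w *= 2.0 * (q if r > 0 else 1.0 - q)
            num += w * v
            den += w
        new_m = num / den
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m
```

With q = 0.5 this reduces to a Huber location estimate, so a single gross outlier has bounded influence; moving q away from 0.5 tilts the weights toward the lower or upper part of the data.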

3.
This article considers estimation of the unknown linear index coefficients of a model in which a number of nonparametrically identified reduced form parameters are assumed to be smooth and invertible functions of one or more linear indices. The results extend the previous literature by allowing the number of reduced form parameters to exceed the number of indices (i.e., the indices are “overdetermined” by the reduced form parameters). The estimator of the unknown index coefficients (up to scale) is the eigenvector of a matrix (defined in terms of a first-step nonparametric estimator of the reduced form parameters) corresponding to its smallest (in magnitude) eigenvalue. Under suitable conditions, the proposed estimator is shown to be root-n-consistent and asymptotically normal, and under additional restrictions an efficient choice of a “weight matrix” is derived in the overdetermined case.

4.
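The development from generalized method of moments (GMM) estimation to generalized empirical likelihood (GEL) estimation was driven by the poor small-sample properties of GMM estimators, which prompted the search for improved and extended methods. Through the necessary proofs and derivations, this paper analyzes in detail the logical relationships and mathematical structure of GEL-class estimators (including EL, ET, and CUE) and clarifies the underlying nature of GEL. Stochastic simulation confirms that in small samples GEL-class estimators have smaller bias and mean squared error than GMM estimators; that is, GEL estimation improves on the small-sample properties of GMM estimation.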

5.
Small‐area estimation techniques have typically relied on plug‐in estimation based on models containing random area effects. More recently, regression M‐quantiles have been suggested for this purpose, thus avoiding conventional Gaussian assumptions, as well as problems associated with the specification of random effects. However, the plug‐in M‐quantile estimator for the small‐area mean can be shown to be the expected value of this mean with respect to a generally biased estimator of the small‐area cumulative distribution function of the characteristic of interest. To correct this problem, we propose a general framework for robust small‐area estimation, based on representing a small‐area estimator as a functional of a predictor of this small‐area cumulative distribution function. Key advantages of this framework are that it naturally leads to integrated estimation of small‐area means and quantiles and is not restricted to M‐quantile models. We also discuss mean squared error estimation for the resulting estimators, and demonstrate the advantages of our approach through model‐based and design‐based simulations, with the latter using economic data collected in an Australian farm survey.

6.
The empirical best linear unbiased predictor (EBLUP) is a linear shrinkage of the direct estimate toward the regression estimate and is useful for small area estimation because it increases the precision of estimates of small area means. However, one potential difficulty of EBLUP is that the overall estimate for a larger geographical area based on a sum of EBLUPs is not necessarily identical to the corresponding direct estimate, such as the overall sample mean. To fix this problem, the paper suggests a new method for benchmarking EBLUP in the Fay–Herriot model without assuming normality of random effects and sampling errors. The resulting benchmarked empirical linear shrinkage (BELS) predictor is novel in that the coefficients for benchmarking are adjusted based on the data from each area. To measure the uncertainty of BELS, the second-order unbiased estimator of the mean squared error is derived.
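The shrinkage and benchmarking ideas in this abstract can be illustrated with a small sketch. It assumes an intercept-only Fay–Herriot model with known sampling variances `D` and, unlike a true EBLUP, a known model variance `A` rather than an estimated one. The benchmarking step is a plain ratio adjustment, not the BELS predictor of the paper. The function names and the aggregation weights are hypothetical.

```python
def shrinkage_predictors(y, D, A):
    """Shrink direct estimates y_i toward a common regression estimate.

    Assumes an intercept-only Fay-Herriot model with known sampling
    variances D_i and a known model variance A (illustration only).
    """
    inv_var = [1.0 / (A + d) for d in D]
    # Weighted least squares estimate of the common mean.
    mu = sum(w * v for w, v in zip(inv_var, y)) / sum(inv_var)
    # Shrinkage factor gamma_i = A / (A + D_i): noisier areas shrink more.
    gamma = [A / (A + d) for d in D]
    preds = [g * v + (1.0 - g) * mu for g, v in zip(gamma, y)]
    return preds, mu


def ratio_benchmark(preds, y, weights):
    """Rescale predictors so their weighted sum matches the direct aggregate."""
    target = sum(w * v for w, v in zip(weights, y))
    current = sum(w * p for w, p in zip(weights, preds))
    factor = target / current
    return [p * factor for p in preds]
```

For example, with `y = [10, 20, 30]`, `D = [1, 4, 9]`, `A = 2` and weights `[0.5, 0.3, 0.2]`, the weighted sum of the raw shrinkage predictors differs from the weighted sum of the direct estimates, and `ratio_benchmark` removes that discrepancy by construction.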

7.
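Zhou Wei et al., 《统计研究》 (Statistical Research), 2015, 32(7): 81-86
Remote sensing imagery is a form of big data. Estimating crop sown area from remote sensing usually relies on regression or calibration estimators, which generally require combining ground sample data with remote sensing classification information. For most regression estimators, however, crop area estimates for a provincial population meet precision requirements only at the provincial level and cannot be disaggregated to smaller areas such as counties and townships. Using ground survey sample data for Heilongjiang Province in 2011 together with remote sensing classification results, this paper builds a unit-level small area model in multivariate regression form with multiple response variables, with the small area effects specified as fixed. With this regression estimation approach, the sown area of the main crops can be estimated for each county, and the county estimates sum to the provincial total estimate under the regression model. An accuracy assessment (coefficient of variation, C.V.) of the county-level small area estimates for maize, rice, and soybean in Heilongjiang shows that, on average, they meet county-level precision requirements. The results indicate that small area estimation is well suited to multi-level estimation of crop planting area for a province and its counties.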

8.
Summary.  Multilevel modelling is sometimes used for data from complex surveys involving multistage sampling, unequal sampling probabilities and stratification. We consider generalized linear mixed models and particularly the case of dichotomous responses. A pseudolikelihood approach for accommodating inverse probability weights in multilevel models with an arbitrary number of levels is implemented by using adaptive quadrature. A sandwich estimator is used to obtain standard errors that account for stratification and clustering. When level 1 weights are used that vary between elementary units in clusters, the scaling of the weights becomes important. We point out that not only variance components but also regression coefficients can be severely biased when the response is dichotomous. The pseudolikelihood methodology is applied to complex survey data on reading proficiency from the American sample of the 'Program for international student assessment' 2000 study, using the Stata program gllamm which can estimate a wide range of multilevel and latent variable models. Performance of pseudo-maximum-likelihood with different methods for handling level 1 weights is investigated in a Monte Carlo experiment. Pseudo-maximum-likelihood estimators of (conditional) regression coefficients perform well for large cluster sizes but are biased for small cluster sizes. In contrast, estimators of marginal effects perform well in both situations. We conclude that caution must be exercised in pseudo-maximum-likelihood estimation for small cluster sizes when level 1 weights are used.

9.
Estimators of the intercept parameter of a simple linear regression model involve the slope estimator. In this article, we consider the estimation of the intercept parameters of two linear regression models with normal errors, when it is suspected a priori, but not certain, that the two regression lines are parallel. We also introduce a coefficient of distrust as a measure of the degree of lack of trust in the uncertain prior information regarding the equality of the two slopes. Three different estimators of the intercept parameters are defined by using the sample data, the non-sample uncertain prior information, an appropriate test statistic, and the coefficient of distrust. The relative performances of the unrestricted, shrinkage restricted and shrinkage preliminary test estimators are investigated based on analyses of the bias and risk functions under quadratic loss. If the prior information is precise and the coefficient of distrust is small, the shrinkage preliminary test estimator outperforms the other estimators. An example based on a medical study is used to illustrate the method.

10.
In this work, we develop a method of adaptive non‐parametric estimation, based on ‘warped’ kernels. The aim is to estimate a real‐valued function s from a sample of random couples (X,Y). We deal with transformed data (Φ(X),Y), with Φ a one‐to‐one function, to build a collection of kernel estimators. The data‐driven bandwidth selection is performed with a method inspired by Goldenshluger and Lepski (Ann. Statist., 39, 2011, 1608). The method can handle various problems such as additive and multiplicative regression, conditional density estimation, hazard rate estimation based on randomly right‐censored data, and cumulative distribution function estimation from current‐status data. The interest is threefold. First, the squared‐bias/variance trade‐off is automatically realized. Next, non‐asymptotic risk bounds are derived. Lastly, the estimator is easily computed, thanks to its simple expression: a short simulation study is presented.

11.
Nonparametric estimation of the regression function for additive models is investigated in cases where the observed data are dependent. An additive kernel estimator for the regression function under some general mixing conditions is proposed. Under the mixing conditions, the additive kernel estimator is shown to be asymptotically normal.

12.
This paper considers the problem of estimating a nonlinear statistical model subject to stochastic linear constraints among unknown parameters. These constraints represent prior information which originates from a previous estimation of the same model using an alternative database. One feature of this specification allows for the design matrix of stochastic linear restrictions to be estimated. The mixed regression technique and the maximum likelihood approach are used to derive the estimator for both the model coefficients and the unknown elements of this design matrix. The proposed estimator, whose asymptotic properties are studied, contains as a special case the conventional mixed regression estimator based on a fixed design matrix. A new test of compatibility between prior and sample information is also introduced. The suggested estimator is tested empirically with both simulated and actual marketing data.

13.
This paper considers the problem of estimating a nonlinear statistical model subject to stochastic linear constraints among unknown parameters. These constraints represent prior information which originates from a previous estimation of the same model using an alternative database. One feature of this specification allows for the design matrix of stochastic linear restrictions to be estimated. The mixed regression technique and the maximum likelihood approach are used to derive the estimator for both the model coefficients and the unknown elements of this design matrix. The proposed estimator, whose asymptotic properties are studied, contains as a special case the conventional mixed regression estimator based on a fixed design matrix. A new test of compatibility between prior and sample information is also introduced. The suggested estimator is tested empirically with both simulated and actual marketing data.

14.
In this article, we discuss the estimation of regression coefficients in a multiple regression model when it is suspected that the coefficients may be restricted to a subspace. To improve the preliminary test almost ridge estimator, we study the positive-rule Stein-type almost unbiased ridge estimator, which is based on the positive-rule Stein-type shrinkage estimator and the almost unbiased ridge estimator. The quadratic bias and quadratic risk of the new estimator are then derived and compared with those of some related estimators. We also discuss the choice of the parameter k. Finally, we present a real data example and a Monte Carlo study to illustrate the theoretical results.

15.
Summary.  We develop a class of log-linear structural models that is suited to estimation of small area cross-classified counts based on survey data. This allows us to account for various association structures within the data and includes as a special case the restricted log-linear model underlying structure preserving estimation. The effect of survey design can be incorporated into estimation through the specification of an unbiased direct estimator and its associated covariance structure. We illustrate our approach by applying it to estimation of small area labour force characteristics in Norway.

16.
In this paper we define a finite mixture of quantile and M-quantile regression models for heterogeneous and/or dependent/clustered data. Components of the finite mixture represent clusters of individuals with homogeneous values of model parameters. Because of their flexibility and ease of estimation, the proposed approaches can be extended to random coefficients with a higher dimension than the simple random intercept case. Estimation of model parameters is obtained through maximum likelihood, by implementing an EM-type algorithm. The standard error estimates for model parameters are obtained using the inverse of the observed information matrix, derived through the Oakes (J R Stat Soc Ser B 61:479–482, 1999) formula in the M-quantile setting, and through nonparametric bootstrap in the quantile case. We present a large scale simulation study to analyse the practical behaviour of the proposed model and to evaluate the empirical performance of the proposed standard error estimates for model parameters. We considered a variety of empirical settings in both the random intercept and the random coefficient case. The proposed modelling approaches are also applied to two well-known datasets which give further insights on their empirical behaviour.

17.
Random effects models can account for lack of fit in a regression model and increase the precision of estimates of area‐level means. However, when the synthetic mean provides accurate estimates, the prior distribution may inflate the estimation error. Thus, it is desirable to consider an uncertain prior distribution, which is expressed as a mixture of a one‐point distribution and a proper prior distribution. In this paper, we develop an empirical Bayes approach for estimating area‐level means, using the uncertain prior distribution in the context of a natural exponential family, which we call the empirical uncertain Bayes (EUB) method. The regression model considered in this paper includes the Poisson‐gamma, the binomial‐beta, and the normal‐normal (Fay–Herriot) models, which are typically used in small area estimation. We obtain the estimators of hyperparameters based on the marginal likelihood by using a well‐known expectation‐maximization algorithm and propose the EUB estimators of area means. For risk evaluation of the EUB estimator, we derive a second‐order unbiased estimator of a conditional mean squared error by using some techniques of numerical calculation. Through simulation studies and real data applications, we evaluate the performance of the EUB estimator and compare it with the usual empirical Bayes estimator.

18.
Fuzzy least-squares regression can be very sensitive to unusual data (e.g., outliers). In this article, we describe how to fit an alternative robust regression estimator in a fuzzy environment, which attempts to identify and ignore unusual data. The proposed approach draws on classical robust regression and estimation methods that are insensitive to outliers. In this regard, based on the least trimmed squares estimation method, an estimation procedure is proposed for determining the coefficients of the fuzzy regression model for crisp input-fuzzy output data. The investigated fuzzy regression model is applied to bedload transport data to forecast suspended load from discharge, based on real-world data. The accuracy of the proposed method is compared with the well-known fuzzy least-squares regression model. The comparison, which is based on a similarity measure between fuzzy sets, shows that the fuzzy robust regression model performs better than the other models in suspended load estimation for this particular dataset. The proposed model is general and can be used for modeling natural phenomena whose available observations are reported as imprecise rather than crisp.
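The least trimmed squares principle mentioned in this abstract can be sketched for ordinary (crisp) data. The example below is not the fuzzy procedure of the article: it fits a straight line to crisp inputs and crisp outputs by exhaustively searching all subsets of size h, which is only practical for small samples. The function names `ols_line` and `lts_line` are hypothetical.

```python
from itertools import combinations


def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return None  # all x equal: the slope is not identified
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def lts_line(xs, ys, h):
    """Least trimmed squares line: minimise the sum of the h smallest
    squared residuals. The minimiser is the OLS fit on the h-subset with
    the smallest residual sum of squares, found here by brute force.
    """
    if not 2 <= h <= len(xs):
        raise ValueError("h must satisfy 2 <= h <= len(xs)")
    best = None
    for idx in combinations(range(len(xs)), h):
        fit = ols_line([xs[i] for i in idx], [ys[i] for i in idx])
        if fit is None:
            continue
        slope, intercept = fit
        rss = sum((ys[i] - intercept - slope * xs[i]) ** 2 for i in idx)
        if best is None or rss < best[0]:
            best = (rss, slope, intercept)
    if best is None:
        raise ValueError("no subset with distinct x values")
    return best[1], best[2]
```

Because the fit only has to explain h of the n observations, up to n - h gross outliers can be ignored entirely, whereas a single outlier can pull an ordinary least squares line arbitrarily far.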

19.
Much of the small‐area estimation literature focuses on population totals and means. However, users of survey data are often interested in the finite‐population distribution of a survey variable and in the measures (e.g. medians, quartiles, percentiles) that characterize the shape of this distribution at the small‐area level. In this paper we propose a model‐based direct estimator (MBDE, Chandra and Chambers) of the small‐area distribution function. The MBDE is defined as a weighted sum of sample data from the area of interest, with weights derived from the calibrated spline‐based estimate of the finite‐population distribution function introduced by Harms and Duchesne, under an appropriately specified regression model with random area effects. We also discuss the mean squared error estimation of the MBDE. Monte Carlo simulations based on both simulated and real data sets show that the proposed MBDE and its associated mean squared error estimator perform well when compared with alternative estimators of the area‐specific finite‐population distribution function.
