1.
Abstract: Multivariate Fay-Herriot (MFH) models have become popular methods for producing reliable parameter estimates of multiple related characteristics of interest that are commonly produced from many surveys. This article studies the application of MFH models for estimating household consumption per capita expenditure (HCPE) on food and on non-food. The two associated direct estimates, which are obtained from the National Socioeconomic Surveys conducted regularly by Statistics Indonesia, are strongly correlated. The effects of correlation in MFH models are evaluated by means of a simulation study. The simulation showed that the strength of correlation between the variables of interest, rather than the number of domains, plays a prominent role in MFH models. The application showed that MFH models are more efficient than univariate models in terms of the standard errors of the regression parameter estimates. The root mean squared errors (RMSEs) of the estimates obtained from the empirical best linear unbiased prediction (EBLUP) estimators of MFH models are smaller than the RMSEs obtained from the direct estimators. Based on the MFH model, the HCPE estimates of food by district in Central Java, Indonesia, are higher than the HCPE estimates of non-food. The averages of the HCPE estimates of food and non-food in Central Java, Indonesia, in 2015 are IDR 383,100.6 and IDR 280,653.6, respectively.
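For orientation, the EBLUP mentioned in this abstract can be illustrated with the standard univariate Fay-Herriot model, which is the single-variable special case and not the multivariate model the article studies. The symbols below (domain index d, direct estimate y_d, covariates x_d, known sampling variance ψ_d, random-effect variance σ_u²) are notation assumed here and do not come from the abstract.

```latex
% Univariate Fay-Herriot area-level model (assumed notation, illustrative only)
y_d = x_d^{\top}\beta + u_d + e_d, \qquad
u_d \sim N(0, \sigma_u^2), \quad e_d \sim N(0, \psi_d), \quad d = 1, \dots, D.

% EBLUP: a weighted average of the direct estimate and the regression-synthetic estimate
\hat{\theta}_d^{\mathrm{EBLUP}} = \hat{\gamma}_d\, y_d + (1 - \hat{\gamma}_d)\, x_d^{\top}\hat{\beta},
\qquad
\hat{\gamma}_d = \frac{\hat{\sigma}_u^2}{\hat{\sigma}_u^2 + \psi_d}.
```

Domains with a large sampling variance ψ_d get a small weight on the direct estimate, which is how model-based estimates can achieve smaller RMSEs than direct estimates.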
2.
3.
Combining information from multiple samples is often needed in biomedical and economic studies, but differences between these samples must be appropriately taken into account in the analysis of the combined data. We study the estimation for moment restriction models with data combined from two samples under an ignorability-type assumption while allowing for different marginal distributions of variables common to both samples. Suppose that an outcome regression (OR) model and a propensity score (PS) model are specified. By leveraging semi-parametric efficiency theory, we derive an augmented inverse probability-weighted (AIPW) estimator that is locally efficient and doubly robust with respect to these models. Furthermore, we develop calibrated regression and likelihood estimators that are not only locally efficient and doubly robust but also intrinsically efficient in achieving smaller variances than the AIPW estimator when the PS model is correctly specified but the OR model may be misspecified. As an important application, we study the two-sample instrumental variable problem and derive the corresponding estimators while allowing for incompatible distributions of variables common to the two samples. Finally, we provide a simulation study and an econometric application on public housing projects to demonstrate the superior performance of our improved estimators. The Canadian Journal of Statistics 48: 259–284; 2020 © 2019 Statistical Society of Canada
4.
Hu Yang 《Communications in Statistics - Theory and Methods》2018,47(20):4958-4976
This paper focuses on robust estimation and variable selection for partially linear models. We combine the weighted least absolute deviation (WLAD) regression with the adaptive least absolute shrinkage and selection operator (LASSO) to achieve simultaneous robust estimation and variable selection for partially linear models. Compared with the LAD-LASSO method, the WLAD-LASSO method is resistant to heavy-tailed errors and to outliers in the parametric components. In addition, we estimate the unknown smooth function by a robust local linear regression. Under some regularity conditions, the theoretical properties of the proposed estimators are established. We further examine the finite-sample performance of the proposed procedure by simulation studies and a real data example.
5.
A nested-error regression model having both fixed and random effects is introduced to estimate linear parameters of small areas. The model is applicable to data having a proportion of domains where the variable of interest cannot be described by a standard linear mixed model. Algorithms and formulas to fit the model, to calculate EBLUP and to estimate mean-squared errors are given. A Monte Carlo simulation experiment is presented to illustrate the gain of precision obtained by using the proposed model and to obtain some practical conclusions. A motivating application to Spanish Labour Force Survey data is also given.
6.
Chao Jia 《Communications in Statistics - Simulation and Computation》2017,46(2):815-822
Robust estimation methods are often used to eliminate or weaken the influence of gross errors on parameter estimation. However, different robust estimation methods may differ in their ability to eliminate or weaken gross errors. Taking unary linear regression as an example, simulation experiments are used to compare 14 frequently used robust estimation methods. The current article summarizes the common characteristics and rules of the robust estimation methods. Finally, we identify several relatively more efficient methods for unary linear regression.
7.
Leonard A. Stefanski 《Communications in Statistics - Theory and Methods》2013,42(12):4335-4358
Let W be a normal random variable with mean μ and known variance σ². Conditions on the function f(·) are given under which there exists an unbiased estimator, F(W), of f(μ) for all real μ. In particular it is shown that f(·) must be an entire function over the complex plane. Infinite series solutions for F(·) are obtained which are shown to be valid under growth conditions of the derivatives, f^(k)(·), of f(·). Approximate solutions are given for the cases in which no exact solution exists. The theory is applied to nonlinear measurement-error models as a means of finding unbiased score functions when measurement error is normally distributed. Relative efficiencies comparing the proposed method to the use of conditional scores (Stefanski and Carroll, 1987) are given for the Poisson regression model with canonical link.
8.
《Journal of Statistical Computation and Simulation》2012,82(12):2652-2669
This work studies outlier detection and robust estimation with data that are naturally distributed into groups and which follow approximately a linear regression model with fixed group effects. For this, several methods are considered. First, the robust fitting method of Peña and Yohai [A fast procedure for outlier diagnostics in large regression problems. J Am Stat Assoc. 1999;94:434–445], called principal sensitivity components (PSC) method, is adapted to the grouped data structure and the mentioned model. The robust methods RDL1 of Hubert and Rousseeuw [Robust regression with both continuous and binary regressors. J Stat Plan Inference. 1997;57:153–163] and M-S of Maronna and Yohai [Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference 2000;89:197–214] are also considered. These three methods are compared in terms of their effectiveness in outlier detection and their robustness through simulations, considering several contamination scenarios and growing contamination levels. Results indicate that the adapted PSC procedure is able to detect a high percentage of true outliers and a small number of false outliers. It is appropriate when the contamination is in the error term or in the covariates, detecting also possibly masked high leverage points. Moreover, in simulations the final robust regression estimator preserved good efficiency under Normality while keeping good robustness properties.
9.
By approximating the nonparametric component using a regression spline in generalized partial linear models (GPLM), robust generalized estimating equations (GEE), involving a bounded score function and a leverage-based weighting function, can be used to estimate the regression parameters in GPLM robustly for longitudinal or clustered data. In this paper, robust score test statistics are proposed for testing the regression parameters, and their asymptotic distributions under the null hypothesis and a class of local alternative hypotheses are studied. The proposed score tests rely on the estimation of a smaller model that does not involve the parameters being tested, and perform well in the simulation studies and real data analysis conducted in this paper.
10.
《Journal of Statistical Computation and Simulation》2012,82(4):359-376
It is well known that Gaussian maximum likelihood estimates of time series models are not robust. In this paper we prove this is also the case for the Generalized Autoregressive Conditional Heteroscedastic (GARCH) models. By expressing the Gaussian maximum likelihood estimates as Ψ estimates and by assuming the existence of a contaminated process, we prove they possess zero breakdown point and unbounded influence curves. By simulating GARCH processes under several proportions of contamination, we assess how biased the maximum likelihood estimates may become and compare these results to a robust alternative. The Student-t maximum likelihood estimates of GARCH models are also considered.
11.
E. Andrés Houseman 《Journal of the Royal Statistical Society. Series C, Applied statistics》2005,54(4):769-780
Summary. Time series arise often in environmental monitoring settings, which typically involve measuring processes repeatedly over time. In many such applications, observations are irregularly spaced and, additionally, are not distributed normally. An example is water monitoring data collected in Boston Harbor by the Massachusetts Water Resources Authority. We describe a simple robust approach for estimating regression parameters and a first-order autocorrelation parameter in a time series where the observations are irregularly spaced. Estimates are obtained from an estimating equation that is constructed as a linear combination of estimated innovation errors, suitably made robust by symmetric and possibly bounded functions. Under an assumption of data missing completely at random and mild regularity conditions, the proposed estimating equation yields consistent and asymptotically normal estimates. Simulations suggest that our estimator performs well in moderate sample sizes. We demonstrate our method on Secchi depth data collected from Boston Harbor.
12.
M. V. Kulikova 《Journal of applied statistics》2013,40(3):495-507
This paper is concerned with the volatility modeling of a set of South African Rand (ZAR) exchange rates. We investigate the quasi-maximum-likelihood (QML) estimator based on the Kalman filter and explore how well a choice of stochastic volatility (SV) models fits the data. We note that a data set from a developing country is used. The main results are: (1) the SV model parameter estimates are in line with those reported from the analysis of high-frequency data for developed countries; (2) the SV models we considered, along with their corresponding QML estimators, fit the data well; (3) using the range return instead of the absolute return as a volatility proxy produces QML estimates that are both less biased and less variable; (4) although the log range of the ZAR exchange rates has a distribution that is quite far from normal, the corresponding QML estimator has a superior performance when compared with the log absolute return.
13.
Ryan Janicki 《Communications in Statistics - Theory and Methods》2020,49(9):2264-2284
Abstract: Linear mixed effects models have been popular in small area estimation problems for modeling survey data when the sample size in one or more areas is too small for reliable inference. However, when the data are restricted to a bounded interval, the linear model may be inappropriate, particularly if the data are near the boundary. Nonlinear sampling models are becoming increasingly popular for small area estimation problems when the normal model is inadequate. This paper studies the use of a beta distribution as an alternative to the normal distribution as a sampling model for survey estimates of proportions which take values in (0, 1). Inference for small area proportions based on the posterior distribution of a beta regression model ensures that point estimates and credible intervals take values in (0, 1). Properties of a hierarchical Bayesian small area model with a beta sampling distribution and logistic link function are presented and compared to those of the linear mixed effect model. Propriety of the posterior distribution using certain noninformative priors is shown, and behavior of the posterior mean as a function of the sampling variance and the model variance is described. An example using 2010 Small Area Income and Poverty Estimates (SAIPE) data is given, and a numerical example studying small sample properties of the model is presented.
14.
In this paper, a new small domain estimator for area-level data is proposed. The proposed estimator is driven by a real problem of estimating the mean price of habitation transactions at a regional level in a European country, using data collected from a longitudinal survey conducted by a national statistical office. At the desired level of inference, it is not possible to provide accurate direct estimates because the sample sizes in these domains are very small. An area-level model with a heterogeneous covariance structure of random effects assists the proposed combined estimator. This model is an extension of a model due to Fay and Herriot [5], but it integrates information across domains and over several periods of time. In addition, a modified method of estimation of variance components for time-series and cross-sectional area-level models is proposed by including the design weights. A Monte Carlo simulation, based on real data, is conducted to investigate the performance of the proposed estimators in comparison with other estimators frequently used in small area estimation problems. In particular, we compare the performance of these estimators with the estimator based on the Rao–Yu model [23]. The simulation study also assesses the performance of the modified variance component estimators in comparison with the traditional ANOVA method. Simulation results show that the proposed estimators perform better than the other estimators in terms of both precision and bias.
15.
In longitudinal studies, missing responses and mismeasured covariates commonly arise from the data collection process. Without caution in data analysis, inferences from standard statistical approaches may lead to wrong conclusions. In order to improve the estimation for longitudinal data analysis, a doubly robust estimation method for partially linear models, which can simultaneously account for the missing responses and mismeasured covariates, is proposed. Imprecisions of covariates are corrected by taking advantage of the independence between replicate measurement errors, and missing responses are handled by the doubly robust estimation under the mechanism of missing at random. The asymptotic properties of the proposed estimators are established under regularity conditions, and simulation studies demonstrate the desired properties. Finally, the proposed method is applied to data from the Lifestyle Education for Activity and Nutrition study.
16.
Richard Stevens 《Journal of applied statistics》2003,30(9):967-981
When a published statistical model is also distributed as computer software, it will usually be desirable to present the outputs as interval, as well as point, estimates. The present paper compares three methods for approximate interval estimation about a model output, for use when the model form does not permit an exact interval estimate. The methods considered are first-order asymptotics, using second derivatives of the log-likelihood to estimate variance information; higher-order asymptotics based on the signed-root transformation; and the non-parametric bootstrap. The signed-root method is Bayesian, and uses an approximation for posterior moments that has not previously been tested in a real-world application. Use of the three methods is illustrated with reference to a software project arising in medical decision-making, the UKPDS Risk Engine. Intervals from the first-order and signed-root methods are near-identical, and typically 1% wider to 7% narrower than those from the non-parametric bootstrap. The asymptotic methods are markedly faster than the bootstrap method.
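As a rough illustration of the non-parametric bootstrap named in this abstract, the sketch below computes a percentile bootstrap interval for a sample mean. It is a deliberate simplification: the paper bootstraps a model output, whereas here the "model output" is just the mean of made-up data, and the function name `percentile_bootstrap_ci` and all numbers are hypothetical.

```python
import random
import statistics


def percentile_bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data) (illustrative sketch only)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample the data with replacement and recompute the statistic each time.
    replicates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    # Read off the empirical alpha/2 and 1 - alpha/2 quantiles of the replicates.
    lo_idx = int((alpha / 2) * (n_boot - 1))
    hi_idx = int((1 - alpha / 2) * (n_boot - 1))
    return replicates[lo_idx], replicates[hi_idx]


# Made-up sample standing in for observed data.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 5.6]
low, high = percentile_bootstrap_ci(sample)
print(f"point estimate: {statistics.mean(sample):.3f}")
print(f"95% percentile bootstrap interval: ({low:.3f}, {high:.3f})")
```

Each replicate requires recomputing the statistic (or, for a real model, refitting it), which is consistent with the abstract's remark that the asymptotic methods are markedly faster than the bootstrap.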
17.
《Journal of Statistical Computation and Simulation》2012,82(1):73-80
The inverse Gaussian (IG) distribution is often applied in statistical modelling, especially with lifetime data. We present tests for outlying values of the parameters (μ, λ) of this distribution when data are available from a sample of independent units and possibly with more than one event per unit. Outlier tests are constructed from likelihood ratio tests for equality of parameters. The test for an outlying value of λ is based on an F-distributed statistic that is transformed to an approximate normal statistic when there are unequal numbers of events per unit. Simulation studies are used to confirm that Bonferroni tests have accurate size and to examine the powers of the tests. The application to first hitting time models, where the IG distribution is derived from an underlying Wiener process, is described. The tests are illustrated on data concerning the strength of different lots of insulating material.
18.
《Journal of Statistical Computation and Simulation》2012,82(12):2364-2377
Motivated by the Singapore Longitudinal Aging Study (SLAS), we propose a Bayesian approach for the estimation of semiparametric varying-coefficient models for longitudinal continuous and cross-sectional binary responses. These models have proved to be more flexible than simple parametric regression models. Our development is a new contribution towards their Bayesian solution, which eases computational complexity. We also consider adapting all kinds of familiar statistical strategies to address the missing data issue in the SLAS. Our simulation results indicate that a Bayesian imputation (BI) approach performs better than complete-case (CC) and available-case (AC) approaches, especially under small sample designs, and may provide more useful results in practice. In the real data analysis for the SLAS, the results for longitudinal outcomes from BI are similar to AC analysis, differing from those with CC analysis.
19.
Bouchra R. Nasri, Bruno N. Rémillard, Mamadou Y. Thioub 《Revue canadienne de statistique》2020,48(1):79-96
We consider several time series, and for each of them, we fit an appropriate dynamic parametric model. This produces serially independent error terms for each time series. The dependence between these error terms is then modelled by a regime-switching copula. The EM algorithm is used for estimating the parameters and a sequential goodness-of-fit procedure based on Cramér–von Mises statistics is proposed to select the appropriate number of regimes. Numerical experiments are performed to assess the validity of the proposed methodology. As an example of application, we evaluate a European put-on-max option on the returns of two assets. To facilitate the use of our methodology, we have built an R package HMMcopula available on CRAN. The Canadian Journal of Statistics 48: 79–96; 2020 © 2020 Statistical Society of Canada
20.
Hao Qu 《Communications in Statistics - Simulation and Computation》2013,42(9):2539-2551
Abstract: This paper considers panel data models with fixed effects which have grouped patterns with unknown group membership. A two-stage estimation (TSE) procedure is developed to improve the properties of the GFE estimators of common parameters when the time span is small. First, the common parameters are estimated. Subsequently, the optimal group assignment and the estimators of group effects are obtained by the K-means algorithm. Monte Carlo results reveal that the TSE estimator has a much smaller bias than the GFE estimator when the differences between group effects are moderately small or when the variance of the idiosyncratic error is high.