Similar literature
 20 similar documents found
1.
A previously known result in the econometrics literature is that when covariates of an underlying data generating process are jointly normally distributed, estimates from a nonlinear model that is misspecified as linear can be interpreted as average marginal effects. This has been shown for models with exogenous covariates and separability between covariates and errors. In this paper, we extend this identification result to a variety of more general cases, in particular for combinations of separable and nonseparable models under both exogeneity and endogeneity. So long as the underlying model belongs to one of these large classes of data generating processes, our results show that nothing else must be known about the true DGP—beyond normality of observable data, a testable assumption—in order for linear estimators to be interpretable as average marginal effects. We use simulation to explore the performance of these estimators using a misspecified linear model and show they perform well when the data are normal but can perform poorly when this is not the case.  相似文献   
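The identification result in this abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the cubic data generating process, the sample size, and the helper names `ols_slope` and `simulate` are all assumptions chosen for illustration. It covers only the simplest case (one standard normal covariate, an exogenous additive error) and compares the slope of a misspecified linear fit with the sample average of the true marginal effect.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a least-squares regression of y on x with an intercept."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def simulate(n=200_000, seed=0):
    """Compare the misspecified linear slope with the average marginal effect.

    Hypothetical DGP (an assumption, not from the paper):
        y = x**3 + e, with x ~ N(0, 1) and e ~ N(0, 1) independent of x.
    The marginal effect is dy/dx = 3 * x**2, whose population mean is 3.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = x**3 + rng.standard_normal(n)
    slope = ols_slope(x, y)           # linear model fitted to nonlinear data
    ame = float(np.mean(3 * x**2))    # sample average marginal effect
    return slope, ame


slope, ame = simulate()
print(f"linear slope: {slope:.3f}, average marginal effect: {ame:.3f}")
```

With a normal covariate the two numbers should be close to each other; replacing `rng.standard_normal(n)` for `x` with a skewed distribution is one way to see the agreement break down, in line with the abstract's remark about non-normal data.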

2.
Existing estimators of a finite population distribution function that utilize auxiliary information are often constructed by a pointwise argument. As a result, these estimators are not always monotone. We adopt a functional approach to the problem and propose two estimators based on compositions of functions. Asymptotic variance formulae are derived for the proposed estimators. Comparisons are made with existing estimators in a simulation study using three natural populations.

3.
Abstract. It is quite common in epidemiology that we wish to assess the quality of estimators on a particular set of information, whereas the estimators may use a larger set of information. Two examples are studied: the first occurs when we construct a model for an event which happens if a continuous variable is above a certain threshold. We can compare estimators based on the observation of only the event or on the whole continuous variable. The other example is that of predicting the survival based only on survival information or using in addition information on a disease. We develop modified Akaike information criterion (AIC) and likelihood cross-validation (LCV) criteria to compare estimators in this non-standard situation. We show that a normalized difference of AIC has a bias equal to o(n^{-1}) if the estimators are based on well-specified models; a normalized difference of LCV always has a bias equal to o(n^{-1}). A simulation study shows that both criteria work well, although the normalized difference of LCV tends to be better and is more robust. Moreover, in the case of well-specified models the difference of risks boils down to the difference of statistical risks, which can be rather precisely estimated. For 'compatible' models the difference of risks is often the main term but there can also be a difference of mis-specification risks.

4.
Mixed effects models and Berkson measurement error models are widely used. They share features which the author uses to develop a unified estimation framework. He deals with models in which the random effects (or measurement errors) have a general parametric distribution, whereas the random regression coefficients (or unobserved predictor variables) and error terms have nonparametric distributions. He proposes a second-order least squares estimator and a simulation-based estimator based on the first two moments of the conditional response variable given the observed covariates. He shows that both estimators are consistent and asymptotically normally distributed under fairly general conditions. The author also reports Monte Carlo simulation studies showing that the proposed estimators perform satisfactorily for relatively small sample sizes. Compared to the likelihood approach, the proposed methods are computationally feasible and do not rely on the normality assumption for random effects or other variables in the model.  相似文献   

5.
6.
Biao Zhang 《Statistics》2016,50(5):1173-1194
Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study.

7.
Summary.  There is a large literature on methods of analysis for randomized trials with noncompliance which focuses on the effect of treatment on the average outcome. The paper considers evaluating the effect of treatment on the entire distribution and general functions of this effect. For distributional treatment effects, fully non-parametric and fully parametric approaches have been proposed. The fully non-parametric approach could be inefficient but the fully parametric approach is not robust to the violation of distribution assumptions. We develop a semiparametric instrumental variable method based on the empirical likelihood approach. Our method can be applied to general outcomes and general functions of outcome distributions and allows us to predict a subject's latent compliance class on the basis of an observed outcome value in observed assignment and treatment received groups. Asymptotic results for the estimators and likelihood ratio statistic are derived. A simulation study shows that our estimators of various treatment effects are substantially more efficient than the currently used fully non-parametric estimators. The method is illustrated by an analysis of data from a randomized trial of an encouragement intervention to improve adherence to prescribed depression treatments among depressed elderly patients in primary care practices.  相似文献   

8.
The variance of short-term systematic measurement errors for the difference of paired data is estimated. The difference of paired data is determined by subtracting the measurement results of two methods, which measure the same item only once without measurement repetition. The unbiased estimators for short-term systematic measurement error variances based on the one-way random effects model are not fit for practical purpose because they can be negative. The estimators, which are derived for balanced data as well as for unbalanced data, are always positive but biased. The basis of these positive estimators is the one-way random effects model. The biases, variances, and the mean squared errors of the positive estimators are derived as well as their estimators. The positive estimators are fit for practical purpose.  相似文献   

9.
Although mixtures form a rich class of probability models, they often present difficulties for statistical inference. Likelihood functions are sometimes unbounded at certain values of the parameters, and densities often have no closed form. These features complicate both maximum-likelihood estimation and tests of fit based on the empirical distribution function. New inferential methods using sample characteristic functions (CFs) and moment generating functions (MGFs) seem well-suited to mixtures, since these transforms often take simple form. This paper reports a simulation study of the properties of estimators and tests of fit based on CFs, MGFs, and sample moments when applied to three specific families of thick-tailed mixture distributions.

10.
ABSTRACT

This article investigates the finite sample properties of a range of inference methods for propensity score-based matching and weighting estimators frequently applied to evaluate the average treatment effect on the treated. We analyze both asymptotic approximations and bootstrap methods for computing variances and confidence intervals in our simulation designs, which are based on German register data and U.S. survey data. We vary the design w.r.t. treatment selectivity, effect heterogeneity, share of treated, and sample size. The results suggest that in general, theoretically justified bootstrap procedures (i.e., wild bootstrapping for pair matching and standard bootstrapping for “smoother” treatment effect estimators) dominate the asymptotic approximations in terms of coverage rates for both matching and weighting estimators. Most findings are robust across simulation designs and estimators.  相似文献   

11.
Most research in the theory of survey sampling deals only with sampling errors, under the assumptions that (i) response is complete and (ii) the information recorded from individuals is correct; in practice, these assumptions do not always hold. Non-sampling errors such as non-response and measurement errors (MEs) often creep into a survey and can become more influential for estimators than sampling errors. Considering non-response and MEs jointly, we propose an optimum class of estimators for the population mean under simple random sampling using conventional and non-conventional measures. The bias and mean square error of the proposed estimators are derived up to the first degree of approximation. Moreover, a simulation study is conducted to assess the performance of the new estimators; it indicates that the proposed estimators are more efficient than the traditional Hansen and Hurwitz estimator and other competing estimators.

12.
Small area estimation (SAE) is concerned with how to reliably estimate population quantities of interest when some areas or domains have very limited samples. This is an important issue in large population surveys, because the geographical areas or groups with only small samples or even no samples are often of interest to researchers and policy-makers. For example, large population health surveys, such as the Behavioural Risk Factor Surveillance System and the Ohio Medicaid Assessment Survey (OMAS), are regularly conducted for monitoring insurance coverage and healthcare utilization. Classic approaches usually provide accurate estimators at the state level or large geographical region level, but they fail to provide reliable estimators for many rural counties where the samples are sparse. Moreover, a systematic evaluation of the performance of SAE methods in real-world settings is lacking in the literature. In this paper, we propose a Bayesian hierarchical model with constraints on the parameter space and show that it provides superior estimators for county-level adult uninsured rates in Ohio based on the 2012 OMAS data. Furthermore, we perform extensive simulation studies to compare our methods with a collection of common SAE strategies, including direct estimators, synthetic estimators, composite estimators, and the Bayesian hierarchical model-based estimators of Datta GS, Ghosh M, Steorts R, Maples J. [Bayesian benchmarking with applications to small area estimation. Test 2011;20(3):574–588]. To set a fair basis for comparison, we generate our simulation data with characteristics mimicking the real OMAS data, so that neither model-based nor design-based strategies use the true model specification. The estimators based on our proposed model are shown to outperform other estimators for small areas in both the simulation study and the real data analysis.

13.
We consider multi-center experiments (for determining a consensus value) conducted in possibly heterogeneous set-ups leading to unbalanced heteroscedastic one-way random effects models. When normality of both the random components and their homoscedasticity are in doubt, standard statistical methods may not be valid. Two robust R-estimators (for the common location parameter), based on signed-rank statistics, are proposed and their properties studied. When large heteroscedasticity is present or the distribution of random effect is abnormal, the proposed estimators perform better than the classical weighted least squares and selected estimators. This feature is illustrated with an arsenic in oyster tissue problem, along with some other simulation studies.  相似文献   

14.
The zero-inflated Poisson regression model is commonly used when analyzing economic data that come in the form of non-negative integers since it accounts for excess zeros and overdispersion of the dependent variable. However, a problem often encountered when analyzing economic data that has not been addressed for this model is multicollinearity. This paper proposes ridge regression (RR) estimators and some methods for estimating the ridge parameter k for a non-negative model. A simulation study has been conducted to compare the performance of the estimators. Both mean squared error and mean absolute error are considered as the performance criteria. The simulation study shows that some estimators are better than the commonly used maximum-likelihood estimator and some other RR estimators. Based on the simulation study and an empirical application, some useful estimators are recommended for practitioners.  相似文献   

15.
A simulation study of the binomial-logit model with correlated random effects is carried out based on the generalized linear mixed model (GLMM) methodology. Simulated data with various numbers of regression parameters and different values of the variance component are considered. The performance of approximate maximum likelihood (ML) and residual maximum likelihood (REML) estimators is evaluated. For a range of true parameter values, we report the average biases of estimators, the standard error of the average bias and the standard error of estimates over the simulations. In general, in terms of bias, the two methods do not show significant differences in estimating regression parameters. The REML estimation method is slightly better in reducing the bias of variance component estimates.  相似文献   

16.
For the lifetime (or negative) exponential distribution, the trimmed likelihood estimator has been shown to be explicit in the form of a β‐trimmed mean which is representable as an estimating functional that is both weakly continuous and Fréchet differentiable and hence qualitatively robust at the parametric model. It also has high efficiency at the model. The robustness is in contrast to the maximum likelihood estimator (MLE) involving the usual mean which is not robust to contamination in the upper tail of the distribution. When there is known right censoring, it may be perceived that the MLE which is the most asymptotically efficient estimator may be protected from the effects of ‘outliers’ due to censoring. We demonstrate that this is not the case generally, and in fact, based on the functional form of the estimators, suggest a hybrid defined estimator that incorporates the best features of both the MLE and the β‐trimmed mean. Additionally, we study the pure trimmed likelihood estimator for censored data and show that it can be easily calculated and that the censored observations are not always trimmed. The different trimmed estimators are compared by a modest simulation study.  相似文献   

17.
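Data Augmentation (DA) imputation is one of the most commonly used MCMC multiple imputation methods. Using simulation, this study examines the coefficient estimates of a linear regression model under DA imputation and analyzes how the statistical properties of the estimates are affected by the nonresponse mechanism, the nonresponse rate, and the number of imputations. The simulation results show that, when nonresponse is completely at random, choosing a small number of imputations often yields good regression coefficient estimates; when nonresponse is at random, choosing a larger number of imputations as the nonresponse rate increases tends to yield better regression coefficient estimates; and when nonresponse is not at random, choosing a larger number of imputations does not always yield better regression coefficient estimates.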

18.
This article examines the sequential, full information maximum likelihood (FIML), and linearized maximum likelihood (LML) estimators for a nested logit model of time-of-day choice for work trips. These estimators are compared using a Monte Carlo study based on specification and data from a previously published empirical study. The sequential estimator is found to be much less efficient than LML or FIML, and its uncorrected second-stage standard-error estimates are strongly downward biased. LML is only slightly less efficient than FIML, but it is often easier to compute. There are cases in which the sequential and LML estimators do not exist, but FIML still performs well.  相似文献   

19.
In this paper, we consider the non-penalty shrinkage estimation method for random effect models with autoregressive errors for longitudinal data when there are many covariates and some of them may not be active for the response variable. In observational studies, subjects are followed over equally or unequally spaced visits to determine the continuous response and whether the response is associated with the risk factors/covariates. Measurements from the same subject are usually more similar to each other and thus are correlated with each other but not with observations of other subjects. To analyse these data, we consider a linear model that contains both random effects across subjects and within-subject errors that follow an autoregressive structure of order 1 (AR(1)). Considering the subject-specific random effect as a nuisance parameter, we use two competing models: one includes all the covariates and the other restricts the coefficients based on the auxiliary information. We consider the non-penalty shrinkage estimation strategy that shrinks the unrestricted estimator in the direction of the restricted estimator. We discuss the asymptotic properties of the shrinkage estimators using the notion of asymptotic biases and risks. A Monte Carlo simulation study is conducted to examine the relative performance of the shrinkage estimators with the unrestricted estimator when the shrinkage dimension exceeds two. We also numerically compare the performance of the shrinkage estimators to that of the LASSO estimator. A longitudinal CD4 cell count data set is used to illustrate the usefulness of the shrinkage and LASSO estimators.

20.
Linear regression models are useful statistical tools to analyze data sets in different fields. There are several methods to estimate the parameters of a linear regression model. These methods usually perform well under normally distributed and uncorrelated errors. If the error terms are correlated, the Conditional Maximum Likelihood (CML) estimation method under a normality assumption is often used to estimate the parameters of interest. The CML estimation method requires a distributional assumption on the error terms. However, in practice, such distributional assumptions on the error terms may not be plausible. In this paper, we propose to estimate the parameters of a linear regression model with an autoregressive error term using the Empirical Likelihood (EL) method, which is a distribution-free estimation method. A small simulation study is provided to evaluate the performance of the proposed estimation method relative to the CML method. The results of the simulation study show that the proposed estimators based on the EL method are remarkably better than the estimators obtained from the CML method in terms of mean squared error (MSE) and bias in almost all the simulation configurations. These findings are also confirmed by the results of the numerical and real data examples.
