Similar Documents (20 results)
1.
We consider the problem of testing the hypothesis that the correlation coefficient is stable in a sequence of n observations of independent, bivariate normal random variables against the alternative that the correlation coefficient changes after an unknown point t (t < n). We propose an estimate of the changepoint t and report on power comparisons between the commonly used test for this problem and our proposed test. Some applications to finance are discussed.
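The abstract does not spell out the estimator, but a common approach to this changepoint problem splits the series at each candidate t and compares Fisher-z transformed sample correlations before and after the split. A minimal numpy sketch under that assumption (the simulated data and the minimum segment length are illustrative, not from the paper):

```python
import numpy as np

def fisher_z(r):
    # Fisher's variance-stabilizing transform of a correlation coefficient
    return np.arctanh(r)

def estimate_changepoint(x, y, min_seg=20):
    """Estimate the index t at which the correlation between x and y changes,
    by maximizing the absolute difference between the Fisher-z transformed
    sample correlations before and after each candidate split point."""
    n = len(x)
    best_t, best_stat = None, -np.inf
    for t in range(min_seg, n - min_seg):
        r1 = np.corrcoef(x[:t], y[:t])[0, 1]
        r2 = np.corrcoef(x[t:], y[t:])[0, 1]
        stat = abs(fisher_z(r1) - fisher_z(r2))
        if stat > best_stat:
            best_stat, best_t = stat, t
    return best_t, best_stat

# Simulated example: the correlation switches from 0 to 0.9 at t = 100.
rng = np.random.default_rng(0)
n = 200
x = rng.standard_normal(n)
e = rng.standard_normal(n)
y = np.where(np.arange(n) < 100, e, 0.9 * x + np.sqrt(1 - 0.9**2) * e)
t_hat, _ = estimate_changepoint(x, y)
```

With a change this pronounced the estimated split lands near the true t = 100; in practice the maximized statistic would be referred to a null distribution before declaring a change.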

2.
Prior studies have shown that automated variable selection results in models with substantially inflated estimates of the model R², and that a large proportion of selected variables are truly noise variables. These earlier studies used simulated data sets whose sample sizes were at most 100. We used Monte Carlo simulations to examine the large-sample performance of backwards variable elimination. We found that in large samples, backwards variable elimination resulted in estimates of R² that were at most marginally biased. However, even in large samples, backwards elimination tended to identify the correct regression model in a minority of the simulated data sets.
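A minimal Monte Carlo sketch of the phenomenon the abstract describes, using |t|-statistic-based backwards elimination on pure-noise predictors; the threshold, sample size, and data-generating setup are illustrative assumptions, not the authors' simulation design:

```python
import numpy as np

def fit_r2_tvals(X, y, cols):
    """OLS fit of y on the given columns of X (plus intercept);
    returns R^2 and the absolute t-statistics of the slopes."""
    Xc = np.column_stack([np.ones(len(y)), X[:, cols]])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    r2 = 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    sigma2 = resid @ resid / (len(y) - Xc.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xc.T @ Xc)))[1:]
    return r2, np.abs(beta[1:] / se)

def backward_eliminate(X, y, t_thresh=2.0):
    """Drop the predictor with the smallest |t| until all remaining
    predictors exceed the threshold (a rough stand-in for
    p-value-based backwards elimination)."""
    cols = list(range(X.shape[1]))
    while cols:
        r2, tvals = fit_r2_tvals(X, y, cols)
        worst = int(np.argmin(tvals))
        if tvals[worst] >= t_thresh:
            return cols, r2
        cols.pop(worst)
    return cols, 0.0

rng = np.random.default_rng(1)
n, p = 50, 20
X = rng.standard_normal((n, p))   # pure noise predictors
y = rng.standard_normal(n)        # y is independent of every column of X
kept, r2 = backward_eliminate(X, y)
```

Any variables surviving elimination here are noise by construction, and the reported R² of the selected model reflects only chance correlations; repeating this over many simulated data sets (and varying n) reproduces the small-sample inflation and large-sample behavior discussed above.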

3.
The classification of a random variable based on a mixture can be meaningfully discussed only if the class of all finite mixtures is identifiable. In this paper, we find the maximum-likelihood estimates of the parameters of the mixture of two inverse Weibull distributions by using classified and unclassified observations. Next, we estimate the nonlinear discriminant function of the underlying model. Also, we calculate the total probabilities of misclassification as well as the percentage bias. In addition, we investigate the performance of all results through a series of simulation experiments by means of relative efficiencies. Finally, we analyse some simulated and real data sets through the findings of the paper.

4.
Non-Gaussian spatial responses are usually modeled using a spatial generalized linear mixed model with spatial random effects. The likelihood function of this model usually cannot be given in closed form, so the maximum likelihood approach is very challenging. Numerical ways to maximize the likelihood function, such as the Monte Carlo Expectation Maximization and Quadrature Pairwise Expectation Maximization algorithms, can be applied but may be computationally very slow or even prohibitive. The Gauss–Hermite quadrature approximation is only suitable for low-dimensional latent variables, and its accuracy depends on the number of quadrature points. Here, we propose a new approximate pairwise maximum likelihood method for inference in the spatial generalized linear mixed model. This approximate method is fast and deterministic, using no sampling-based strategies. The performance of the proposed method is illustrated through two simulation examples, and practical aspects are investigated through a case study on a rainfall data set.
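The pairwise-likelihood idea can be illustrated on a toy Gaussian random field rather than the paper's spatial GLMM: each term in the objective is a bivariate normal log-density for a pair of nearby sites, and the sum is maximized over the correlation-range parameter. Everything below (one-dimensional sites, exponential correlation, the cutoff distance) is an illustrative assumption, not the authors' method:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
s = np.arange(100.0)                      # site locations on a line (illustrative)
phi_true = 5.0
d = np.abs(s[:, None] - s[None, :])
cov = np.exp(-d / phi_true)               # unit-variance exponential covariance
z = np.linalg.cholesky(cov + 1e-10 * np.eye(len(s))) @ rng.standard_normal(len(s))

# pairwise methods typically use only pairs of sites within a cutoff distance
pairs = [(i, j) for i in range(len(s)) for j in range(i + 1, len(s)) if d[i, j] <= 10.0]

def neg_pairwise_loglik(phi):
    """Negative pairwise log-likelihood (up to a constant): each term is the
    bivariate normal log-density of (z_i, z_j) with correlation exp(-d_ij/phi)."""
    total = 0.0
    for i, j in pairs:
        r = np.exp(-d[i, j] / phi)
        q = (z[i] ** 2 - 2.0 * r * z[i] * z[j] + z[j] ** 2) / (1.0 - r ** 2)
        total += 0.5 * np.log(1.0 - r ** 2) + 0.5 * q
    return total

phi_hat = minimize_scalar(neg_pairwise_loglik, bounds=(0.5, 50.0), method="bounded").x
```

The objective involves only bivariate densities, so it sidesteps the high-dimensional integral that makes the full likelihood of a spatial GLMM intractable; the paper's contribution is an approximate pairwise objective for the non-Gaussian case.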

5.
A major difficulty in meta-analysis is publication bias. Studies with positive outcomes are more likely to be published than studies reporting negative or inconclusive results. Correcting for this bias is not possible without making untestable assumptions. In this paper, a sensitivity analysis is discussed for the meta-analysis of 2×2 tables using exact conditional distributions. A Markov chain Monte Carlo EM algorithm is used to calculate maximum likelihood estimates. A rule for increasing the accuracy of estimation and automating the choice of the number of iterations is suggested.

6.
In this article, to reduce the computational load of Bayesian variable selection, we used a variant of reversible jump Markov chain Monte Carlo methods and the Holmes and Held (HH) algorithm to sample model index variables in logistic mixed models involving a large number of explanatory variables. Furthermore, we proposed a simple proposal distribution for model index variables, and used a simulation study and a real example to compare the performance of the HH algorithm under our proposed and existing proposal distributions. The results show that the HH algorithm with our proposed proposal distribution is a computationally efficient and reliable selection method.

7.
This paper proposes a new approach to the treatment of item non-response in attitude scales. It combines the ideas of latent variable identification with the issues of non-response adjustment in sample surveys. The latent variable approach allows missing values to be included in the analysis and, equally importantly, allows information about attitude to be inferred from non-response. We present a symmetric pattern methodology for handling item non-response in attitude scales. The methodology is symmetric in that all the variables are given equivalent status in the analysis (none is designated a 'dependent' variable) and is pattern based in that the pattern of responses and non-responses across individuals is a key element in the analysis. Our approach to the problem is through a latent variable model with two latent dimensions: one to summarize response propensity and the other to summarize attitude, ability or belief. The methodology presented here can handle binary, metric and mixed (binary and metric) manifest items with missing values. Examples using both artificial data sets and two real data sets are used to illustrate the mechanism and the advantages of the methodology proposed.

8.
The cumulative exposure model (CEM) is a statistical model commonly used to analyze data from step-stress accelerated life testing, a special class of accelerated life testing (ALT). In practice, researchers conduct ALT to (1) determine the effects of extreme levels of stress factors (e.g., temperature) on the life distribution, and (2) gain information on the parameters of the life distribution more rapidly than under normal operating (or environmental) conditions. In the literature, researchers assume that the CEM comes from well-known distributions, such as the Weibull family. This study, on the other hand, considers a p-step-stress model with q stress factors from the two-parameter Birnbaum-Saunders distribution when there is a time constraint on the duration of the experiment. In this comparison paper, we consider different frameworks for numerically computing point estimates of the unknown parameters of the CEM using maximum likelihood theory. Each framework implements at least one optimization method; therefore, numerical examples and extensive Monte Carlo simulations are considered to compare and numerically examine the performance of the considered estimation frameworks.
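As a generic illustration of comparing optimization frameworks for maximum likelihood (using a two-parameter Weibull rather than the Birnbaum-Saunders CEM of the paper, and `scipy.optimize` methods chosen for illustration), the same negative log-likelihood can be handed to several optimizers and the resulting estimates compared:

```python
import numpy as np
from scipy.optimize import minimize

def weibull_nll(theta, x):
    """Negative log-likelihood of a Weibull(shape k, scale lam) sample."""
    k, lam = theta
    if k <= 0 or lam <= 0:
        return np.inf                     # keep unbounded methods in the valid region
    return -np.sum(np.log(k / lam) + (k - 1.0) * np.log(x / lam) - (x / lam) ** k)

rng = np.random.default_rng(4)
x = 1.5 * rng.weibull(2.0, size=500)      # true shape 2.0, true scale 1.5

fits = {}
for method, kwargs in (("Nelder-Mead", {}),
                       ("L-BFGS-B", {"bounds": [(1e-6, None), (1e-6, None)]})):
    fits[method] = minimize(weibull_nll, x0=[1.0, 1.0], args=(x,),
                            method=method, **kwargs)
```

With a well-behaved likelihood both a derivative-free method and a gradient-based method recover essentially the same estimates; the interest in a comparison study is precisely the cases (constrained parameters, flat or multimodal likelihoods) where the frameworks disagree in accuracy or cost.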

9.
Variable selection in finite mixture of regression (FMR) models is frequently used in statistical modeling. The majority of applications of variable selection in FMR models assume a normal distribution for the regression error. Such an assumption is unsuitable for data containing a group or groups of observations with heavy tails and outliers. In this paper, we introduce a robust variable selection procedure for FMR models using the t distribution. With appropriate selection of the tuning parameters, the consistency and the oracle property of the regularized estimators are established. To estimate the parameters of the model, we develop an EM algorithm for numerical computations and a method for selecting the tuning parameters adaptively. The parameter estimation performance of the proposed model is evaluated through simulation studies. The application of the proposed model is illustrated by analyzing a real data set.

10.
We derive two C(α) statistics and the likelihood-ratio statistic for testing the equality of several correlation coefficients, from k ≥ 2 independent random samples from bivariate normal populations. The asymptotic relationship of the C(α) tests, the likelihood-ratio test, and a statistic based on the normality assumption of Fisher's Z-transform of the sample correlation coefficient is established. A comparative performance study, in terms of size and power, is then conducted by Monte Carlo simulations. The likelihood-ratio statistic is often too liberal, and the statistic based on Fisher's Z-transform is conservative. The performance of the two C(α) statistics is identical. They maintain significance level well and have almost the same power as the other statistics when empirically calculated critical values of the same size are used. The C(α) statistic based on a noniterative estimate of the common correlation coefficient (based on Fisher's Z-transform) is recommended.
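The statistic based on Fisher's Z-transform mentioned above has a standard form: with z_i = arctanh(r_i) and weights n_i − 3, the weighted sum of squared deviations from the pooled z is approximately chi-square with k − 1 degrees of freedom under a common correlation. A sketch with simulated data (sample sizes and correlations are illustrative):

```python
import numpy as np

def fisher_equality_test(samples):
    """Chi-square statistic for testing equality of correlation coefficients
    across k independent bivariate normal samples, based on Fisher's
    Z-transform; approximately chi-square with k - 1 degrees of freedom
    under the null of a common correlation."""
    z = np.array([np.arctanh(np.corrcoef(s[:, 0], s[:, 1])[0, 1]) for s in samples])
    w = np.array([len(s) - 3 for s in samples], dtype=float)
    zbar = np.sum(w * z) / np.sum(w)
    return np.sum(w * (z - zbar) ** 2)

def bivariate(rng, r, n):
    # simulated bivariate normal sample with correlation r (illustrative)
    c = np.array([[1.0, r], [r, 1.0]])
    return rng.multivariate_normal([0.0, 0.0], c, size=n)

rng = np.random.default_rng(2)
stat_same = fisher_equality_test([bivariate(rng, 0.5, 100) for _ in range(3)])
stat_diff = fisher_equality_test([bivariate(rng, 0.0, 100), bivariate(rng, 0.9, 100)])
```

Under equal correlations the statistic stays near its chi-square reference; under clearly different correlations it is far in the tail, which is the behavior the size/power comparison in the paper quantifies against the C(α) and likelihood-ratio statistics.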

11.
Non-ignorable missing data, a serious problem in both clinical trials and observational studies, can lead to biased inferences. Quality-of-life measures have become increasingly popular in clinical trials. However, these measures are often incompletely observed, and investigators may suspect that missing quality-of-life data are likely to be non-ignorable. Although several recent references have addressed missing covariates in survival analysis, they all required the assumption that missingness is at random or that all covariates are discrete. We present a method for estimating the parameters in the Cox proportional hazards model when missing covariates may be non-ignorable and continuous or discrete. Our method is useful in reducing the bias and improving efficiency in the presence of missing data. The methodology clearly specifies assumptions about the missing-data mechanism and, through sensitivity analysis, helps investigators to understand the potential effect of missing data on study results.

12.
This article uses several approaches to deal with the difficulty involved in evaluating the intractable integral when using Gibbs sampling to estimate the nonlinear mixed effects model (NLMM) based on the Dirichlet process (DP). For illustration, we applied these approaches to real data and simulations. Comparisons are then made between these methods with respect to estimation accuracy and computing efficiency.

13.
Bayesian semiparametric inference is considered for a loglinear model. This model consists of a parametric component for the regression coefficients and a nonparametric component for the unknown error distribution. Bayesian analysis is studied for the case of a parametric prior on the regression coefficients and a mixture-of-Dirichlet-processes prior on the unknown error distribution. A Markov chain Monte Carlo (MCMC) method is developed to compute the features of the posterior distribution. A model selection method for obtaining a more parsimonious set of predictors is studied. The method adds indicator variables to the regression equation; the set of indicator variables represents all the possible subsets to be considered. An MCMC method is developed to search stochastically for the best subset. These procedures are applied to two examples, one with censored data.

14.
We revisit the problem of estimating the proportion π of true null hypotheses when a large number of parallel hypothesis tests are performed independently. While the proportion is a quantity of interest in its own right in applications, the problem has arisen in assessing or controlling an overall false discovery rate. On the basis of a Bayes interpretation of the problem, the marginal distribution of the p-value is modeled as a mixture of the uniform distribution (null) and a non-uniform distribution (alternative), so that the parameter π of interest is characterized as the mixing proportion of the uniform component in the mixture. In this article, a nonparametric exponential mixture model is proposed to fit the p-values. As an alternative to the convex decreasing mixture model, the exponential mixture model has the advantages of identifiability, flexibility, and regularity. A computation algorithm is developed. The new approach is applied to a leukemia gene expression data set where multiple significance tests over 3,051 genes are performed. The new estimate of π for the leukemia gene expression data appears to be about 10% lower than the other three estimates, which are known to be conservative. Simulation results also show that the new estimate is usually lower and has smaller bias than the other three estimates.
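The mixing-proportion idea can be illustrated with a much simpler threshold estimator than the paper's exponential mixture: p-values above a cutoff λ come almost entirely from the uniform null component, so their relative frequency, rescaled by 1 − λ, estimates π. Both the estimator and the simulated mixture below are illustrative, not the authors' method:

```python
import numpy as np

def pi0_lambda(pvals, lam=0.5):
    """Threshold estimator of the proportion of true nulls: p-values above
    lam are (almost) all from the uniform null component, so their
    frequency rescaled by 1 - lam estimates the mixing proportion."""
    pvals = np.asarray(pvals)
    return np.mean(pvals > lam) / (1.0 - lam)

rng = np.random.default_rng(3)
m, pi0_true = 3000, 0.8
null_p = rng.uniform(size=int(m * pi0_true))            # uniform under the null
alt_p = rng.beta(0.2, 5.0, size=m - int(m * pi0_true))  # concentrated near 0
pi0_hat = pi0_lambda(np.concatenate([null_p, alt_p]))
```

Because some alternative p-values also exceed λ, this estimator is biased upward (conservative), which is exactly the behavior the paper's mixture-model estimate aims to improve on.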

15.
This article considers the shrinkage estimation procedure in Cox's proportional hazards regression model when it is suspected that some of the parameters may be restricted to a subspace. We have developed the statistical properties of the shrinkage estimators, including asymptotic distributional biases and risks. The shrinkage estimators have much higher relative efficiency than the classical estimator. Furthermore, we consider two penalty estimators, the LASSO and the adaptive LASSO, and compare their relative performance with that of the shrinkage estimators numerically. A Monte Carlo simulation experiment is conducted for different combinations of irrelevant predictors, and the performance of each estimator is evaluated in terms of simulated mean squared error. The simulation study shows that the shrinkage estimators are comparable to the penalty estimators when the number of irrelevant predictors in the model is relatively large. The shrinkage and penalty methods are applied to two real data sets to illustrate the usefulness of the procedures in practice.

16.
This article proposes a new mixed variable lot-size multiple dependent state sampling plan in which an attributes sampling plan is used in the first stage and a variables multiple dependent state sampling plan based on the process capability index is used in the second stage for the inspection of measurable quality characteristics. The proposed mixed plan is developed for both symmetric and asymmetric fraction nonconforming. The optimal plan parameters can be determined by considering the satisfaction levels of the producer and the consumer simultaneously at an acceptable quality level and a limiting quality level, respectively. The performance of the proposed plan over the mixed single sampling plan based on Cpk and the mixed variable lot-size plan based on Cpk with respect to the average sample number is also investigated. Tables are constructed for easy selection of plan parameters for both symmetric and asymmetric fraction nonconforming, and real-world examples are given to illustrate the practical implementation of the proposed mixed variable lot-size plan.
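The attributes first stage of such a plan is typically a single sampling plan whose operating characteristic is a binomial tail probability. A minimal sketch (the plan parameters n = 50, c = 2 and the quality levels are illustrative, not values from the article):

```python
from math import comb

def accept_prob(p, n, c):
    """Operating-characteristic value of a single attributes sampling plan:
    sample n items and accept the lot if at most c are nonconforming,
    so the acceptance probability is a binomial tail sum."""
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(c + 1))

# Illustrative plan (n = 50, c = 2): high acceptance at a good quality
# level, low acceptance at a poor one.
p_accept_good = accept_prob(0.01, 50, 2)   # fraction nonconforming 1%
p_accept_bad = accept_prob(0.10, 50, 2)    # fraction nonconforming 10%
```

Designing a plan amounts to choosing the parameters so that the curve traced by `accept_prob` passes through the producer's and consumer's required points at the acceptable and limiting quality levels, which is what the article's tables solve for in the mixed two-stage setting.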

17.
Many medical applications need to determine a patient's disease status, and that status can be related to multiple serial measurements. Nevertheless, for various reasons, the binary outcome can be measured incorrectly, and estimators derived from the misspecified outcome can be biased. This paper derives the complete-data likelihood function that incorporates both the multiple serial measurements and the misspecified outcome. Owing to the latent variables, an EM algorithm is used to derive the maximum-likelihood estimators. Monte Carlo simulations are conducted to compare the impact of misspecification on the estimates. A retrospective data set on the recurrence of atrial fibrillation is used to illustrate the usage of the proposed model.

18.
The maximum likelihood approach to estimating factor analytic model parameters most commonly deals with outcomes that are assumed to be multivariate Gaussian random variables in a homogeneous input space. In many practical settings, however, studies that need factor analytic modeling involve data that not only are not multivariate Gaussian but also come from a partitioned input space. This article introduces an extension of maximum likelihood factor analysis that handles multivariate outcomes made up of attributes with different probability distributions and originating from a partitioned input space. An EM algorithm combined with Fisher scoring is used to estimate the parameters of the derived model.

19.
A growth curve analysis is often applied to estimate patterns of change in a given characteristic of different individuals. It is also used to find out whether variations in the growth rates among individuals are due to the effects of certain covariates. In this paper, a random coefficient linear regression model, as a special case of growth curve analysis, is generalized to accommodate the situation where the set of influential covariates is not known a priori. Two different approaches for selecting influential covariates (a weighted stepwise selection procedure and a modified version of Rao and Wu's selection criterion) for the random slope coefficient of a linear regression model with unbalanced data are proposed. The performance of these methods is evaluated by means of Monte Carlo simulation. In addition, several methods (maximum likelihood, restricted maximum likelihood, pseudo maximum likelihood, and method of moments) for estimating the parameters of the selected model are compared. The proposed variable selection schemes and estimators are applied to the actual industrial problem that motivated this investigation.

20.
The risk of an individual woman having a pregnancy associated with Down's syndrome is estimated given her age, α-fetoprotein, human chorionic gonadotropin, and pregnancy-specific β1-glycoprotein levels. The classical estimation method is based on discriminant analysis under the assumption of lognormality of the marker values, but logistic regression is also applied for data classification. In the present work, we compare the performance of the two methods using a dataset containing the data of almost 89,000 unaffected and 333 affected pregnancies. Assuming lognormality of the marker values, we also calculate the theoretical detection and false positive rates for both the methods.
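Under the lognormality assumption, the discriminant-analysis risk is the prior odds multiplied by a Gaussian likelihood ratio on the log-marker scale, converted back to a probability. The sketch below uses entirely hypothetical means, covariance, and prior; real screening relies on population-calibrated parameters:

```python
import numpy as np

def affected_risk(x, mu0, mu1, cov, prior1):
    """Posterior probability of an affected pregnancy given the marker
    vector x, assuming log marker values are multivariate normal with a
    common covariance and class-specific means (the discriminant-analysis
    model under lognormality).  All numbers used below are hypothetical."""
    z = np.log(np.asarray(x, dtype=float))
    icov = np.linalg.inv(cov)
    q0 = -0.5 * (z - mu0) @ icov @ (z - mu0)
    q1 = -0.5 * (z - mu1) @ icov @ (z - mu1)
    odds = np.exp(q1 - q0) * prior1 / (1.0 - prior1)  # likelihood ratio x prior odds
    return odds / (1.0 + odds)

# Hypothetical two-marker illustration: unaffected log-means 0, affected 1.
mu0, mu1, cov, prior1 = np.zeros(2), np.ones(2), np.eye(2), 1.0 / 700.0
risk_high = affected_risk(np.exp([1.0, 1.0]), mu0, mu1, cov, prior1)  # markers at affected means
risk_low = affected_risk([1.0, 1.0], mu0, mu1, cov, prior1)           # markers at unaffected means
```

Logistic regression instead models the log-odds directly as a linear function of the (log) markers; when the lognormality assumption with common covariance holds, the two approaches imply the same linear form, which is why the paper's comparison on real data is informative about the assumption itself.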
