1.

Correlation and efficiency of propensity score-based estimators for average causal effects

Ronnie Pingel Ingeborg Waernbaum 《Communications in Statistics - Simulation and Computation》2017,46(5):3458-3478

Propensity score-based estimators are commonly used to estimate causal effects in evaluation research. To reduce bias in observational studies, researchers might be tempted to include many, perhaps correlated, covariates when estimating the propensity score model. Taking into account that the propensity score is estimated, this study investigates how the efficiency of matching, inverse probability weighting, and doubly robust estimators changes when covariates are correlated. Propositions regarding the large sample variances under certain assumptions on the data-generating process are given. The propositions are supplemented by several numerical large sample and finite sample results from a wide range of models. The results show that the covariate correlations may increase or decrease the variances of the estimators. There are several factors that influence how correlation affects the variance of the estimators, including the choice of estimator, the strength of the confounding toward outcome and treatment, and whether a constant or non-constant causal effect is present.

2.

Inference for proportional hazard model with propensity score

Bo Lu Luheng Wang Xingwei Tong Huiyun Xiang 《Communications in Statistics - Theory and Methods》2018,47(12):2908-2918

Since the publication of the seminal paper by Cox (1972), the proportional hazard model has become very popular in regression analysis for right censored data. In observational studies, treatment assignment may depend on observed covariates. If these confounding variables are not accounted for properly, the inference based on the Cox proportional hazard model may perform poorly. As shown in Rosenbaum and Rubin (1983), under the strongly ignorable treatment assignment assumption, conditioning on the propensity score yields valid causal effect estimates. Therefore we incorporate the propensity score into the Cox model for causal inference with survival data. We derive the asymptotic property of the maximum partial likelihood estimator when the model is correctly specified. Simulation results show that our method performs quite well for observational data. The approach is applied to a real dataset on the time of readmission of trauma patients. We also derive the asymptotic property of the maximum partial likelihood estimator with a robust variance estimator, when the model is incorrectly specified.

3.

Doubly robust empirical likelihood inference in covariate-missing data problems

Biao Zhang 《Statistics》2016,50(5):1173-1194

Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study.

4.

Multiply robust matching estimators of average and quantile treatment effects

Shu Yang Yunshu Zhang 《Scandinavian Journal of Statistics》2023,50(1):235-265

Propensity score matching has been a long-standing tradition for handling confounding in causal inference, but it requires stringent model assumptions. In this article, we propose novel double score matching (DSM) utilizing both the propensity score and prognostic score. To protect against possible model misspecification, we posit multiple candidate models for each score. We show that the debiasing DSM estimator achieves the multiple robustness property in that it is consistent if any one of the score models is correctly specified. We characterize the asymptotic distribution for the DSM estimator requiring only one correct model specification based on the martingale representations of the matching estimators and theory for local normal experiments. We also provide a two-stage replication method for variance estimation and extend DSM for quantile estimation. Simulation demonstrates DSM outperforms single-score matching and prevailing multiply robust weighting estimators in the presence of extreme propensity scores.

5.

Effects of model misspecification on tests of no randomized treatment effect arising from Cox's proportional hazards model

A. G. DiRienzo & S. W. Lagakos 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(4):745-757

We examine the asymptotic and small sample properties of model-based and robust tests of the null hypothesis of no randomized treatment effect based on the partial likelihood arising from an arbitrarily misspecified Cox proportional hazards model. When the distribution of the censoring variable is either conditionally independent of the treatment group given covariates or conditionally independent of covariates given the treatment group, the numerators of the partial likelihood treatment score and Wald tests have asymptotic mean equal to 0 under the null hypothesis, regardless of whether or how the Cox model is misspecified. We show that the model-based variance estimators used in the calculation of the model-based tests are not, in general, consistent under model misspecification, yet using analytic considerations and simulations we show that their true sizes can be as close to the nominal value as tests calculated with robust variance estimators. As a special case, we show that the model-based log-rank test is asymptotically valid. When the Cox model is misspecified and the distribution of censoring depends on both treatment group and covariates, the asymptotic distributions of the resulting partial likelihood treatment score statistic and maximum partial likelihood estimator do not, in general, have a zero mean under the null hypothesis. Here neither the fully model-based tests, including the log-rank test, nor the robust tests will be asymptotically valid, and we show through simulations that the distortion to test size can be substantial.

6.

Multiple robustness estimation in causal inference

Lei Wang 《Communications in Statistics - Theory and Methods》2013,42(23):5701-5718

Estimation of the average treatment effect is crucial in causal inference for evaluation of treatments or interventions in biostatistics, epidemiology, econometrics, and sociology. However, existing estimators require that a propensity score model, an outcome vector model, or both be correctly specified, which is difficult to verify in practice. In this paper, we allow multiple models for both the propensity score models and the outcome models, and then construct a weighting estimator based on observed data by using two-sample empirical likelihood. The resulting estimator is consistent if any one of those multiple models is correctly specified, and thus provides multiple protection on consistency. Moreover, the proposed estimator can attain the semiparametric efficiency bound when one propensity score model and one outcome vector model are correctly specified, without requiring knowledge of which models are correct. Simulations are performed to evaluate the finite sample performance of the proposed estimators. As an application, we analyze the data collected from the AIDS Clinical Trials Group Protocol 175.

7.

INDEX TO VOLUME 17 (1999)

Xun Lu 《Journal of Business & Economic Statistics》2013,31(4):506-507

We study how to select or combine estimators of the average treatment effect (ATE) and the average treatment effect on the treated (ATT) in the presence of multiple sets of covariates. We consider two cases: (1) all sets of covariates satisfy the unconfoundedness assumption and (2) some sets of covariates violate the unconfoundedness assumption locally. For both cases, we propose a data-driven covariate selection criterion (CSC) to minimize the asymptotic mean squared errors (AMSEs). Based on our CSC, we propose new average estimators of ATE and ATT, which include the selected estimators based on a single set of covariates as a special case. We derive the asymptotic distributions of our new estimators and propose how to construct valid confidence intervals. Our Monte Carlo simulations show that in finite samples, our new average estimators achieve substantial efficiency gains over the estimators based on a single set of covariates. We apply our new estimators to study the impact of inherited control on firm performance.

8.

On Local Polynomial Modelling of the Additive Risk Model

Wanrong Liu Yongcheng Qi 《Communications in Statistics - Theory and Methods》2013,42(11):1958-1981

The additive risk model provides an alternative modelling technique for failure time data to the proportional hazards model. In this article, we consider the additive risk model with a nonparametric risk effect. We study estimation of the risk function and its derivatives with a parametric and an unspecified baseline hazard function respectively. The resulting estimators are the local likelihood and the local score estimators. We establish the asymptotic normality of the estimators and show that both methods have the same formula for asymptotic bias but different formulas for variance. It is found that, in some special cases, the local score estimator is of the same efficiency as the local likelihood estimator though it does not use the information about the baseline hazard function. Another advantage of the local score estimator is that it has a closed form and is easy to implement. Some simulation studies are conducted to evaluate and compare the performance of the two estimators. A numerical example is used for illustration.

9.

Non-penalty shrinkage estimation of random effect models for longitudinal data with AR(1) errors

Le An Lac 《Journal of Statistical Computation and Simulation》2018,88(16):3230-3247

In this paper, we consider the non-penalty shrinkage estimation method of random effect models with autoregressive errors for longitudinal data when there are many covariates and some of them may not be active for the response variable. In observational studies, subjects are followed over equally or unequally spaced visits to determine the continuous response and whether the response is associated with the risk factors/covariates. Measurements from the same subject are usually more similar to each other and thus are correlated with each other but not with observations of other subjects. To analyse this data, we consider a linear model that contains both random effects across subjects and within-subject errors that follow an autoregressive structure of order 1 (AR(1)). Considering the subject-specific random effect as a nuisance parameter, we use two competing models: one includes all the covariates and the other restricts the coefficients based on the auxiliary information. We consider the non-penalty shrinkage estimation strategy that shrinks the unrestricted estimator in the direction of the restricted estimator. We discuss the asymptotic properties of the shrinkage estimators using the notion of asymptotic biases and risks. A Monte Carlo simulation study is conducted to examine the relative performance of the shrinkage estimators with the unrestricted estimator when the shrinkage dimension exceeds two. We also numerically compare the performance of the shrinkage estimators to that of the LASSO estimator. A longitudinal CD4 cell count data set will be used to illustrate the usefulness of shrinkage and LASSO estimators.
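
The idea of shrinking an unrestricted estimator toward a restricted one can be sketched in a much simpler setting than the one in the abstract. The block below is a minimal illustration that assumes an ordinary linear model fitted by least squares with independent errors (no random effects, no AR(1) structure) and a restriction that sets the last `k` coefficients to zero. The function name `stein_shrinkage_ols` and the positive-part Stein-type weight `1 - (k - 2) / T`, with `T` the Wald statistic for the restriction, are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def stein_shrinkage_ols(X, y, k):
    """Shrink the unrestricted OLS fit toward the fit that zeroes the last k coefficients.

    Hypothetical helper for illustration: plain linear model, independent errors.
    Returns (unrestricted, restricted, shrinkage estimate, shrinkage factor).
    """
    n, p = X.shape
    if k <= 2:
        raise ValueError("Stein-type shrinkage needs more than two restrictions")

    # Unrestricted estimator: OLS on all p covariates.
    beta_u, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Restricted estimator: OLS on the first p - k covariates, zeros elsewhere.
    b1, *_ = np.linalg.lstsq(X[:, :p - k], y, rcond=None)
    beta_r = np.concatenate([b1, np.zeros(k)])

    # Wald statistic for H0: the last k coefficients are zero.
    resid = y - X @ beta_u
    s2 = resid @ resid / (n - p)
    cov_u = s2 * np.linalg.inv(X.T @ X)
    b2 = beta_u[p - k:]
    wald = b2 @ np.linalg.solve(cov_u[p - k:, p - k:], b2)

    # Positive-part weight: 0 means "use restricted", 1 means "use unrestricted".
    factor = max(0.0, 1.0 - (k - 2) / wald)
    beta_s = beta_r + factor * (beta_u - beta_r)
    return beta_u, beta_r, beta_s, factor

# Small demonstration on simulated data where the restriction is true.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
y_demo = X_demo @ np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0]) + rng.normal(size=200)
_, _, beta_s_demo, factor_demo = stein_shrinkage_ols(X_demo, y_demo, k=4)
print("shrinkage factor:", factor_demo)
print("shrinkage estimate:", beta_s_demo)
```

When the data strongly contradict the restriction, the Wald statistic is large and the weight approaches one, so the estimate stays close to the unrestricted fit; when the restriction looks plausible, the estimate moves toward the restricted fit.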

10.

Matching Using Sufficient Dimension Reduction for Causal Inference

Wei Luo 《Journal of Business & Economic Statistics》2020,38(4):888-900

To estimate causal treatment effects, we propose a new matching approach based on the reduced covariates obtained from sufficient dimension reduction. Compared with the original covariates and the propensity score, which are commonly used for matching in the literature, the reduced covariates are nonparametrically estimable and are effective in imputing the missing potential outcomes, under a mild assumption on the low-dimensional structure of the data. Under the ignorability assumption, the consistency of the proposed approach requires a weaker common support condition. In addition, researchers are allowed to employ different reduced covariates to find matched subjects for different treatment groups. We develop relevant asymptotic results and conduct simulation studies as well as real data analysis to illustrate the usefulness of the proposed approach.

11.

Propensity score model specification for estimation of average treatment effects

Ingeborg Waernbaum 《Journal of statistical planning and inference》2010

Treatment effect estimators that utilize the propensity score as a balancing score, e.g., matching and blocking estimators, are robust to misspecifications of the propensity score model when the misspecification is a balancing score. Such misspecifications arise from using the balancing property of the propensity score in the specification procedure. Here, we study misspecifications of a parametric propensity score model written as a linear predictor in a strictly monotonic function, e.g., a generalized linear model representation. Under mild assumptions we show that for misspecifications, such as not adding enough higher order terms or choosing the wrong link function, the true propensity score is a function of the misspecified model. Hence, the latter does not bring bias to the treatment effect estimator. It is also shown that a misspecification of the propensity score does not necessarily lead to less efficient estimation of the treatment effect. The results of the paper are highlighted in simulations where different misspecifications are studied.

12.

A comparison of doubly robust estimators of the mean with missing data

《Journal of Statistical Computation and Simulation》2012,82(16):3383-3403

We consider data with a continuous outcome that is missing at random and a fully observed set of covariates. We compare by simulation a variety of doubly-robust (DR) estimators for estimating the mean of the outcome. An estimator is DR if it is consistent when either the regression model for the mean function or the propensity to respond is correctly specified. Performance of different methods is compared in terms of root mean squared error of the estimates and width and coverage of confidence intervals or posterior credibility intervals in repeated samples. Overall, the DR methods tended to yield better inference than the incorrect model when either the propensity or mean model is correctly specified, but were less successful for small sample sizes, where the asymptotic DR property is less consequential. Two methods tended to outperform the other DR methods: penalized spline of propensity prediction [Little RJA, An H. Robust likelihood-based analysis of multivariate data with missing values. Statist Sinica. 2004;14:949–968] and the robust method proposed in [Cao W, Tsiatis AA, Davidian M. Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika. 2009;96:723–734].
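
The double robustness property defined in this abstract can be illustrated with the standard augmented inverse probability weighting (AIPW) estimator of a mean. The block below is a minimal sketch and not one of the specific methods compared in the paper. It assumes a logistic working model for the response propensity and a linear working model for the outcome; the helper names `fit_logistic` and `aipw_mean`, and the simulated data, are hypothetical.

```python
import numpy as np

def fit_logistic(X, r, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (r - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def aipw_mean(y, r, X_ps, X_or):
    """AIPW estimate of E[Y] when y is observed only where r == 1.

    X_ps: design matrix of the working propensity (response) model.
    X_or: design matrix of the working outcome regression model.
    """
    # Working propensity model: P(R = 1 | X), fitted on everyone.
    beta = fit_logistic(X_ps, r)
    pi_hat = 1.0 / (1.0 + np.exp(-X_ps @ beta))

    # Working outcome model: linear regression fitted on respondents only.
    obs = r == 1
    gamma, *_ = np.linalg.lstsq(X_or[obs], y[obs], rcond=None)
    m_hat = X_or @ gamma

    # Regression prediction plus an inverse-probability-weighted residual correction.
    y_filled = np.where(obs, y, 0.0)  # unobserved outcomes never enter the sum
    return np.mean(m_hat + r * (y_filled - m_hat) / pi_hat)

# Simulated example: the true mean of y is 1 and response depends on x.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
r = (rng.random(n) < 1.0 / (1.0 + np.exp(-(0.5 + x)))).astype(float)

intercept_only = np.ones((n, 1))
with_x = np.column_stack([np.ones(n), x])

print("complete-case mean:        ", y[r == 1].mean())
print("AIPW, propensity misspec.: ", aipw_mean(y, r, intercept_only, with_x))
print("AIPW, outcome misspecified:", aipw_mean(y, r, with_x, intercept_only))
```

In this simulation the complete-case mean is biased because respondents tend to have larger `x`, while each AIPW call has exactly one working model correctly specified and should land near the true mean.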

13.

Comparison of causal effect estimators under exposure misclassification

Manoochehr Babanezhad Stijn Vansteelandt Els Goetghebeur 《Journal of statistical planning and inference》2010

Over the past decades, various principles for causal effect estimation have been proposed, all differing in terms of how they adjust for measured confounders: either via traditional regression adjustment, by adjusting for the expected exposure given those confounders (e.g., the propensity score), or by inversely weighting each subject's data by the likelihood of the observed exposure, given those confounders. When the exposure is measured with error, this raises the question whether these different estimation strategies might be differently affected and whether one of them is to be preferred for that reason. In this article, we investigate this by comparing inverse probability of treatment weighted (IPTW) estimators and doubly robust estimators for the exposure effect in linear marginal structural mean models (MSM) with G-estimators, propensity score (PS) adjusted estimators and ordinary least squares (OLS) estimators for the exposure effect in linear regression models. We find analytically that these estimators are equally affected when exposure misclassification is independent of the confounders, but not otherwise. Simulation studies reveal similar results for time-varying exposures and when the model of interest includes a logistic link.

14.

Estimation of average treatment effects based on parametric propensity score model

Lili Yao Zhihua Sun Qihua Wang 《Journal of statistical planning and inference》2010

In this paper, the estimation of average treatment effects is examined given that the propensity score is of a parametric form with some unknown parameters. Under the assumption that the treatment is ignorable given some observed characteristics, the MLEs of the unknown parameters in the probability assignment model are obtained first, and three estimators are then defined by the inverse probability weighting, regression, and imputation methods, respectively. All the estimators are shown to be asymptotically normal and, more importantly, the first two are shown theoretically to achieve substantial efficiency gains over the existing estimators in Hahn (1998) and Hirano et al. (2003); that is, the inverse probability weighted estimator and the regression estimator have smaller asymptotic variances. Our simulation analysis verifies the theoretical results in terms of biases, SEs and MSEs.
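
The two-step recipe in this abstract (fit a parametric propensity model by maximum likelihood, then weight by the inverse of the fitted propensity) can be sketched as follows. This is a generic illustration under an assumed logistic propensity model, not the paper's estimators or its variance results. The helper names `fit_logistic` and `ipw_ate` and the simulated data are hypothetical, and both the unnormalized (Horvitz-Thompson) and normalized (Hajek) forms are returned.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Logistic-regression MLE by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def ipw_ate(y, t, X):
    """Inverse probability weighted estimates of the average treatment effect.

    Returns (Horvitz-Thompson form, normalized Hajek form).
    """
    # Step 1: MLE of the parametric propensity score e(X) = P(T = 1 | X).
    beta = fit_logistic(X, t)
    e_hat = 1.0 / (1.0 + np.exp(-X @ beta))

    # Step 2: weight each unit by the inverse probability of its observed arm.
    w1 = t / e_hat
    w0 = (1.0 - t) / (1.0 - e_hat)
    ht = np.mean(w1 * y) - np.mean(w0 * y)
    hajek = np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
    return ht, hajek

# Simulated example with a confounder x and a true treatment effect of 2.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
t = (rng.random(n) < 1.0 / (1.0 + np.exp(-0.8 * x))).astype(float)
y = 1.0 + 2.0 * t + 1.5 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

print("naive difference in means:", y[t == 1].mean() - y[t == 0].mean())
print("IPW (HT, Hajek):          ", ipw_ate(y, t, X))
```

The naive difference in means absorbs the confounding through `x`, whereas both weighted forms should be close to the simulated effect when the propensity model is correctly specified.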

15.

GMM estimation in partial linear models with endogenous covariates causing an over-identified problem

Baicheng Chen Yong Zhou 《Communications in Statistics - Theory and Methods》2013,42(11):3168-3184

We study partial linear models where the linear covariates are endogenous and cause an over-identified problem. We propose combining the profile principle with local linear approximation and the generalized method of moments (GMM) to estimate the parameters of interest. We show that the profiled GMM estimators are root-n consistent and asymptotically normally distributed. By appropriately choosing the weight matrix, the estimators can attain the efficiency bound. We further consider variable selection by using the moment restrictions imposed on endogenous variables when the dimension of the covariates may be diverging with the sample size, and propose a penalized GMM procedure, which is shown to have the sparsity property. We establish asymptotic normality of the resulting estimators of the nonzero parameters. Simulation studies are presented to assess the finite-sample performance of the proposed procedure.

16.

General purpose multiply robust data integration procedures for handling nonprobability samples

Sixia Chen David Haziza 《Scandinavian Journal of Statistics》2023,50(2):697-724

In recent years, there has been an increased interest in combining probability and nonprobability samples. Nonprobability samples are cheaper and quicker to conduct but the resulting estimators are vulnerable to bias as the participation probabilities are unknown. To adjust for the potential bias, estimation procedures based on parametric or nonparametric models have been discussed in the literature. However, the validity of the resulting estimators relies heavily on the validity of the underlying models. Also, nonparametric approaches may suffer from the curse of dimensionality and poor efficiency. We propose a data integration approach by combining multiple outcome regression models and propensity score models. The proposed approach can be used for estimating general parameters including totals, means, distribution functions, and percentiles. The resulting estimators are multiply robust in the sense that they remain consistent if all but one model are misspecified. The asymptotic properties of point and variance estimators are established. The results from a simulation study show the benefits of the proposed method in terms of bias and efficiency. Finally, we apply the proposed method using data from the Korea National Health and Nutrition Examination Survey and data from the National Health Insurance Sharing Services.

17.

Parametric and semiparametric models for recapture and removal studies: a likelihood approach

Kani Chen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(3):607-619

Capture–recapture processes are biased samplings of recurrent event processes, which can be modelled by the Andersen–Gill intensity model. The intensity function is assumed to be a function of time, covariates and a parameter. We derive the maximum likelihood estimators of both the parameter and the population size and show the consistency and asymptotic normality of the estimators for both recapture and removal studies. The estimators are asymptotically efficient and their theoretical asymptotic relative efficiencies with respect to the existing estimators of Yip and co-workers can be as large as ∞. The variance estimation and a numerical example are also presented.

18.

Estimation for semiparametric varying coefficient models with different smoothing variables under random right censoring

Seong J. Yang 《Journal of the Korean Statistical Society》2018,47(2):161-171

In this paper we study a semiparametric varying coefficient model when the response is subject to random right censoring. The model gives an easy interpretation due to its direct connectivity to the classical linear model and is very flexible since it admits nonparametric functions that accommodate various nonlinear interaction effects between covariates. We propose estimators for this model using mean-preserving transformation and establish their asymptotic properties. The estimation procedure is based on the profiling and the smooth backfitting techniques. A simulation study is presented to show the reliability of the proposed estimators and an automatic bandwidth selector is given in a data-driven way.

19.

Efficient estimation for time series following generalized linear models

T. Thomson S. Hossain M. Ghahramani 《Australian & New Zealand Journal of Statistics》2016,58(4):493-513

In this paper, we consider James–Stein shrinkage and pretest estimation methods for time series following generalized linear models when it is conjectured that some of the regression parameters may be restricted to a subspace. Efficient estimation strategies are developed when there are many covariates in the model and some of them are not statistically significant. Statistical properties of the pretest and shrinkage estimation methods including asymptotic distributional bias and risk are developed. We investigate the relative performances of shrinkage and pretest estimators with respect to the unrestricted maximum partial likelihood estimator (MPLE). We show that the shrinkage estimators have a lower relative mean squared error as compared to the unrestricted MPLE when the number of significant covariates exceeds two. Monte Carlo simulation experiments were conducted for different combinations of inactive covariates and the performance of each estimator was evaluated in terms of its mean squared error. The practical benefits of the proposed methods are illustrated using two real data sets.

20.

Inference on Survival Data with Covariate Measurement Error – An Imputation-based Approach

YI LI LOUISE RYAN 《Scandinavian Journal of Statistics》2006,33(2):169-190

We propose a new method for fitting proportional hazards models with error-prone covariates. Regression coefficients are estimated by solving an estimating equation that is the average of the partial likelihood scores based on imputed true covariates. For the purpose of imputation, a linear spline model is assumed on the baseline hazard. We discuss consistency and asymptotic normality of the resulting estimators, and propose a stochastic approximation scheme to obtain the estimates. The algorithm is easy to implement, and reduces to the ordinary Cox partial likelihood approach when the measurement error has a degenerate distribution. Simulations indicate high efficiency and robustness. We consider the special case where error-prone replicates are available on the unobserved true covariates. As expected, increasing the number of replicates for the unobserved covariates increases efficiency and reduces bias. We illustrate the practical utility of the proposed method with an Eastern Cooperative Oncology Group clinical trial where a genetic marker, c-myc expression level, is subject to measurement error.