1.

Multiple robustness estimation in causal inference

Lei Wang 《Communications in Statistics - Theory and Methods》2013,42(23):5701-5718

Abstract

Estimation of the average treatment effect is crucial in causal inference for evaluating treatments or interventions in biostatistics, epidemiology, econometrics, and sociology. However, existing estimators require that either a propensity score model, an outcome vector model, or both be correctly specified, which is difficult to verify in practice. In this paper, we allow multiple models for both the propensity score and the outcome, and then construct a weighting estimator based on observed data by using two-sample empirical likelihood. The resulting estimator is consistent if any one of those multiple models is correctly specified, and thus provides multiple protection on consistency. Moreover, the proposed estimator can attain the semiparametric efficiency bound when one propensity score model and one outcome vector model are correctly specified, without requiring knowledge of which models are correct. Simulations are performed to evaluate the finite sample performance of the proposed estimators. As an application, we analyze the data collected from the AIDS Clinical Trials Group Protocol 175.

2.

A comparison of doubly robust estimators of the mean with missing data

《Journal of Statistical Computation and Simulation》2012,82(16):3383-3403

We consider data with a continuous outcome that is missing at random and a fully observed set of covariates. We compare by simulation a variety of doubly-robust (DR) estimators for estimating the mean of the outcome. An estimator is DR if it is consistent when either the regression model for the mean function or the propensity to respond is correctly specified. Performance of different methods is compared in terms of root mean squared error of the estimates and width and coverage of confidence intervals or posterior credibility intervals in repeated samples. Overall, the DR methods tended to yield better inference than the incorrect model when either the propensity or mean model is correctly specified, but were less successful for small sample sizes, where the asymptotic DR property is less consequential. Two methods tended to outperform the other DR methods: penalized spline of propensity prediction [Little RJA, An H. Robust likelihood-based analysis of multivariate data with missing values. Statist Sinica. 2004;14:949–968] and the robust method proposed in [Cao W, Tsiatis AA, Davidian M. Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika. 2009;96:723–734].
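The doubly robust property described in this abstract can be illustrated with the standard augmented inverse probability weighted (AIPW) estimator of a mean. The sketch below is only an illustration under stated assumptions: the function names `aipw_mean` and `simulate`, the data-generating model, and the choice of "wrong" working models are all hypothetical and are not taken from the paper, which compares several other DR methods.

```python
import numpy as np


def aipw_mean(y, r, pi_hat, m_hat):
    """AIPW estimate of E[Y] when Y is observed only where r == 1.

    y      : outcomes (may be NaN where r == 0)
    r      : response indicators (1 = observed, 0 = missing)
    pi_hat : working estimates of P(R = 1 | X)
    m_hat  : working estimates of E[Y | X]
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    pi_hat = np.asarray(pi_hat, dtype=float)
    m_hat = np.asarray(m_hat, dtype=float)
    # Residuals exist only for respondents; set them to zero elsewhere.
    resid = np.where(r == 1, y - m_hat, 0.0)
    # Outcome-model prediction plus inverse-probability-weighted residual.
    return float(np.mean(m_hat + r * resid / pi_hat))


def simulate(n=100_000, seed=0):
    """Hypothetical missing-at-random example whose true mean of Y is 1."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(size=n)
    pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))        # true response propensity
    r = (rng.uniform(size=n) < pi).astype(float)
    y_obs = np.where(r == 1, y, np.nan)

    m_true = 1.0 + 2.0 * x                        # correct outcome model
    m_wrong = np.full(n, np.nanmean(y_obs))       # ignores x entirely
    pi_wrong = np.full(n, r.mean())               # ignores x entirely
    return {
        "complete_case": float(np.nanmean(y_obs)),
        "right_outcome_wrong_propensity": aipw_mean(y_obs, r, pi_wrong, m_true),
        "wrong_outcome_right_propensity": aipw_mean(y_obs, r, pi, m_wrong),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this setup the complete-case mean is biased upward because respondents tend to have larger covariate values, while both AIPW variants should land near the true mean of 1 even though each uses one deliberately misspecified working model.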

3.

Multiply robust matching estimators of average and quantile treatment effects

Shu Yang Yunshu Zhang 《Scandinavian Journal of Statistics》2023,50(1):235-265

Propensity score matching has been a long-standing tradition for handling confounding in causal inference; however, it requires stringent model assumptions. In this article, we propose novel double score matching (DSM) utilizing both the propensity score and prognostic score. To gain protection against possible model misspecification, we posit multiple candidate models for each score. We show that the debiasing DSM estimator achieves the multiple robustness property in that it is consistent if any one of the score models is correctly specified. We characterize the asymptotic distribution for the DSM estimator requiring only one correct model specification based on the martingale representations of the matching estimators and theory for local normal experiments. We also provide a two-stage replication method for variance estimation and extend DSM for quantile estimation. Simulation demonstrates DSM outperforms single-score matching and prevailing multiply robust weighting estimators in the presence of extreme propensity scores.

4.

General purpose multiply robust data integration procedures for handling nonprobability samples

Sixia Chen David Haziza 《Scandinavian Journal of Statistics》2023,50(2):697-724

In recent years, there has been an increased interest in combining probability and nonprobability samples. Nonprobability samples are cheaper and quicker to conduct but the resulting estimators are vulnerable to bias as the participation probabilities are unknown. To adjust for the potential bias, estimation procedures based on parametric or nonparametric models have been discussed in the literature. However, the validity of the resulting estimators relies heavily on the validity of the underlying models. Also, nonparametric approaches may suffer from the curse of dimensionality and poor efficiency. We propose a data integration approach by combining multiple outcome regression models and propensity score models. The proposed approach can be used for estimating general parameters including totals, means, distribution functions, and percentiles. The resulting estimators are multiply robust in the sense that they remain consistent if all but one model are misspecified. The asymptotic properties of point and variance estimators are established. The results from a simulation study show the benefits of the proposed method in terms of bias and efficiency. Finally, we apply the proposed method using data from the Korea National Health and Nutrition Examination Survey and data from the National Health Insurance Sharing Services.

5.

Combining Inverse Probability Weighting and Multiple Imputation to Improve Robustness of Estimation

Peisong Han 《Scandinavian Journal of Statistics》2016,43(1):246-260

Inverse probability weighting (IPW) and multiple imputation are two widely adopted approaches for dealing with missing data. The former models the selection probability, and the latter models the data distribution. Consistent estimation requires correct specification of the corresponding models. Although the augmented IPW method provides an extra layer of protection on consistency, it is usually not sufficient in practice as the true data‐generating process is unknown. This paper proposes a method combining the two approaches in the same spirit as calibration in the survey sampling literature. Multiple models for both the selection probability and data distribution can be simultaneously accounted for, and the resulting estimator is consistent if any model is correctly specified. The proposed method is within the framework of estimating equations and is general enough to cover regression analysis with missing outcomes and/or missing covariates. Results on both theoretical and numerical investigation are provided.

6.

Locally Efficient Semiparametric Estimators for Proportional Hazards Models with Measurement Error

Yuhang Xu Yehua Li Xiao Song 《Scandinavian Journal of Statistics》2016,43(2):558-572

We propose a new class of semiparametric estimators for proportional hazards models in the presence of measurement error in the covariates, where the baseline hazard function, the hazard function for the censoring time, and the distribution of the true covariates are considered as unknown infinite dimensional parameters. We estimate the model components by solving estimating equations based on the semiparametric efficient scores under a sequence of restricted models where the logarithms of the hazard functions are approximated by reduced rank regression splines. The proposed estimators are locally efficient in the sense that the estimators are semiparametrically efficient if the distribution of the error‐prone covariates is specified correctly and are still consistent and asymptotically normal if the distribution is misspecified. Our simulation studies show that the proposed estimators have smaller biases and variances than competing methods. We further illustrate the new method with a real application in an HIV clinical trial.

7.

Double robust estimation in longitudinal marginal structural models

《Journal of statistical planning and inference》2006,136(3):1061-1089

In this article we consider estimation of causal parameters in a marginal structural model for the discrete intensity of the treatment specific counting process (e.g. hazard of a treatment specific survival time) based on longitudinal observational data on treatment, covariates and survival. We define three estimators: the inverse probability of treatment weighted (IPTW) estimator, the maximum likelihood estimator (MLE), and a double robust (DR) estimator. The DR estimator is obtained by following a general methodology for constructing double robust estimating functions in censored data models as described in van der Laan and Robins (Unified Methods for Censored Longitudinal Data and Causality, 2002). The double-robust estimator is consistent and asymptotically linear when either the treatment mechanism or the partial likelihood of the observed data is consistently estimated. We illustrate the superiority of the DR estimator relative to the IPTW and ML estimators in a simulation study. The proposed methodology is also applied to estimate the causal effect of exercise on physical functioning in a longitudinal study of seniors in Sonoma County.

8.

A simple, doubly robust, efficient estimator for survival functions using pseudo observations

Jixian Wang 《Pharmaceutical statistics》2018,17(1):38-48

Survival functions are often estimated by nonparametric estimators such as the Kaplan‐Meier estimator. For valid estimation, proper adjustment for confounding factors is needed when treatment assignment may depend on confounding factors. Inverse probability weighting is a commonly used approach, especially when there is a large number of potential confounders to adjust for. Direct adjustment may also be used if the relationship between the time‐to‐event and all confounders can be modeled. However, either approach requires a correctly specified model for the relationship between confounders and treatment allocation or between confounders and the time‐to‐event. We propose a pseudo‐observation–based doubly robust estimator, which is valid when either the treatment allocation model or the time‐to‐event model is correctly specified and is generally more efficient than the inverse probability weighting approach. The approach can be easily implemented using standard software. A simulation study was conducted to evaluate this approach under a number of scenarios, and the results are presented and discussed. The results confirm robustness and efficiency of the proposed approach. A real data example is also provided for illustration.
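As background for the two building blocks named in this abstract, the following sketch computes the Kaplan-Meier estimate and the usual jackknife (leave-one-out) pseudo-observations for the survival probability at a fixed time. It is not the paper's doubly robust estimator; the function names `km_survival_at` and `pseudo_observations` are hypothetical, and the jackknife definition is assumed to be the one intended.

```python
import numpy as np


def km_survival_at(time, event, t):
    """Kaplan-Meier estimate of S(t) from follow-up times and event flags.

    time  : follow-up times
    event : 1 if the event was observed, 0 if the time is censored
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv = 1.0
    # Multiply the conditional survival factors over distinct event times <= t.
    for u in np.unique(time[(event == 1) & (time <= t)]):
        at_risk = np.sum(time >= u)
        deaths = np.sum((time == u) & (event == 1))
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_observations(time, event, t):
    """Jackknife pseudo-observations for S(t): n*S(t) - (n-1)*S_{-i}(t)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    n = len(time)
    full = km_survival_at(time, event, t)
    pseudo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                  # leave subject i out
        loo = km_survival_at(time[keep], event[keep], t)
        pseudo[i] = n * full - (n - 1) * loo
    return pseudo


if __name__ == "__main__":
    times = [1.0, 2.0, 2.0, 3.0, 4.0]
    events = [1, 1, 0, 1, 0]
    print(km_survival_at(times, events, 3.0))
    print(pseudo_observations(times, events, 3.0))
```

Without censoring, each pseudo-observation reduces to the indicator that the subject survived beyond t, which is why pseudo-observations can stand in for the incompletely observed outcome in standard regression or weighting software.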

9.

Improved methods for moment restriction models with data combination and an application to two-sample instrumental variable estimation

Heng Shu Zhiqiang Tan 《Revue canadienne de statistique》2020,48(2):259-284

Combining information from multiple samples is often needed in biomedical and economic studies, but differences between these samples must be appropriately taken into account in the analysis of the combined data. We study the estimation for moment restriction models with data combined from two samples under an ignorability-type assumption while allowing for different marginal distributions of variables common to both samples. Suppose that an outcome regression (OR) model and a propensity score (PS) model are specified. By leveraging semi-parametric efficiency theory, we derive an augmented inverse probability-weighted (AIPW) estimator that is locally efficient and doubly robust with respect to these models. Furthermore, we develop calibrated regression and likelihood estimators that are not only locally efficient and doubly robust but also intrinsically efficient in achieving smaller variances than the AIPW estimator when the PS model is correctly specified but the OR model may be misspecified. As an important application, we study the two-sample instrumental variable problem and derive the corresponding estimators while allowing for incompatible distributions of variables common to the two samples. Finally, we provide a simulation study and an econometric application on public housing projects to demonstrate the superior performance of our improved estimators. The Canadian Journal of Statistics 48: 259–284; 2020 © 2019 Statistical Society of Canada

10.

New Robust Variable Selection Methods for Linear Regression Models

Ziqi Chen Man‐Lai Tang Wei Gao Ning‐Zhong Shi 《Scandinavian Journal of Statistics》2014,41(3):725-741

Motivated by an entropy inequality, we propose for the first time a penalized profile likelihood method for simultaneously selecting significant variables and estimating unknown coefficients in multiple linear regression models in this article. The new method is robust to outliers or errors with heavy tails and works well even for errors with infinite variance. Our proposed approach outperforms the adaptive lasso in both theory and practice. It is observed from the simulation studies that (i) the new approach possesses a higher probability of correctly selecting the exact model than the least absolute deviation lasso and the adaptively penalized composite quantile regression approach and (ii) exact model selection via our proposed approach is robust regardless of the error distribution. An application to a real dataset is also provided.

11.

Second-order least squares estimation of censored regression models

Taraneh Abarin Liqun Wang 《Journal of statistical planning and inference》2009

This paper proposes the second-order least squares estimation, which is an extension of the ordinary least squares method, for censored regression models where the error term has a general parametric distribution (not necessarily normal). The strong consistency and asymptotic normality of the estimator are derived under fairly general regularity conditions. We also propose a computationally simpler estimator which is consistent and asymptotically normal under the same regularity conditions. Finite sample behavior of the proposed estimators under both correctly specified and misspecified models is investigated through Monte Carlo simulations. The simulation results show that the proposed estimator using the optimal weighting matrix performs very similarly to the maximum likelihood estimator, and the estimator with the identity weight is more robust against misspecification.

12.

Statistical inference for a semiparametric measurement error regression model with heteroscedastic errors

Haibo Zhou Jinhong You 《Journal of statistical planning and inference》2007

Efficient inference for regression models requires that the heteroscedasticity be taken into account. We consider statistical inference under heteroscedasticity in a semiparametric measurement error regression model, in which some covariates are measured with errors. This paper has multiple components. First, we propose a new method for testing the heteroscedasticity. The advantages of the proposed method over the existing ones are that it does not need any nonparametric estimation and does not involve any mismeasured variables. Second, we propose a new two-step estimator for the error variances if there is heteroscedasticity. Finally, we propose a weighted estimating equation-based estimator (WEEBE) for the regression coefficients and establish its asymptotic properties. Compared with existing estimators, the proposed WEEBE is asymptotically more efficient, avoids undersmoothing the regressor functions and requires fewer restrictions on the observed regressors. Simulation studies show that the proposed test procedure and estimators have nice finite sample performance. A real data set is used to illustrate the utility of our proposed methods.

13.

Conditional mix-GEE models for longitudinal data with unspecified random-effects distributions

Yanchun Xing Lili Xu Zhichuan Zhu 《Communications in Statistics - Theory and Methods》2018,47(4):862-876

In longitudinal studies, the mixture generalized estimating equation (mix-GEE) was proposed to improve the efficiency of the fixed-effects estimator for addressing the working correlation structure misspecification. When the subject-specific effect is of interest, mixed-effects models are widely used to analyze longitudinal data. However, most of the existing approaches assume a normal distribution for the random effects, and this could affect the efficiency of the fixed-effects estimator. In this article, a conditional mixture generalized estimating equation (cmix-GEE) approach based on the advantage of mix-GEE and conditional quadratic inference function (CQIF) method is developed. The advantage of our new approach is that it does not require the normality assumption for random effects and can accommodate the serial correlation between observations within the same cluster. The feature of our proposed approach is that the estimators of the regression parameters are more efficient than CQIF even if the working correlation structure is not correctly specified. In addition, according to the estimates of some mixture proportions, the true working correlation matrix can be identified. We establish the asymptotic results for the fixed-effects parameter estimators. Simulation studies were conducted to evaluate our proposed method.

14.

Two-Stage Bounded-Influence Estimators for Simultaneous-Equations Models

William S. Krasker 《Journal of Business & Economic Statistics》2013,31(4):437-444

This article presents a class of estimators for linear structural models that are robust to heavy-tailed disturbance distributions, gross errors in either the endogenous or exogenous variables, and certain other model failures. The class of estimators modifies ordinary two-stage least squares by replacing each least squares regression by a bounded-influence regression. Conditions under which the estimators are qualitatively robust, consistent, and asymptotically normal are established, and an empirical example is presented.
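For orientation, the baseline that this class of estimators modifies, ordinary two-stage least squares, can be sketched as two successive least squares fits. The code below is the plain (non-robust) version only; the bounded-influence regressions of the article are not implemented here, and the function names and simulated data are hypothetical.

```python
import numpy as np


def two_stage_least_squares(y, X, Z):
    """Ordinary 2SLS: regress X on instruments Z, then y on the fitted X.

    y : outcome vector, shape (n,)
    X : regressors (including any constant), shape (n, k)
    Z : instruments (including any constant), shape (n, m) with m >= k
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    # Stage 1: least squares projection of the regressors on the instruments.
    first_stage, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage
    # Stage 2: least squares regression of the outcome on the fitted regressors.
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta


def simulate_2sls(n=50_000, seed=0):
    """Hypothetical example with an endogenous regressor; true slope is 2."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                 # instrument
    u = rng.normal(size=n)                 # unobserved confounder
    x = z + u                              # endogenous regressor
    y = 1.0 + 2.0 * x + u + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    Z = np.column_stack([np.ones(n), z])
    ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    return ols, two_stage_least_squares(y, X, Z)


if __name__ == "__main__":
    ols, tsls = simulate_2sls()
    print("OLS  coefficients:", ols)
    print("2SLS coefficients:", tsls)
```

Because both stages are plain least squares fits, a single gross error in either stage can move the estimate substantially, which is the weakness the bounded-influence replacement is meant to address.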

15.

A comparison of non-homogeneous Markov regression models with application to Alzheimer's disease progression

Hubbard RA Zhou XH 《Journal of applied statistics》2011,38(10):2313-2326

Markov regression models are useful tools for estimating the impact of risk factors on rates of transition between multiple disease states. Alzheimer's disease (AD) is an example of a multi-state disease process in which great interest lies in identifying risk factors for transition. In this context, non-homogeneous models are required because transition rates change as subjects age. In this report we propose a non-homogeneous Markov regression model that allows for reversible and recurrent disease states, transitions among multiple states between observations, and unequally spaced observation times. We conducted simulation studies to demonstrate performance of estimators for covariate effects from this model and compare performance with alternative models when the underlying non-homogeneous process was correctly specified and under model misspecification. In simulation studies, we found that covariate effects were biased if non-homogeneity of the disease process was not accounted for. However, estimates from non-homogeneous models were robust to misspecification of the form of the non-homogeneity. We used our model to estimate risk factors for transition to mild cognitive impairment (MCI) and AD in a longitudinal study of subjects included in the National Alzheimer's Coordinating Center's Uniform Data Set. Using our model, we found that subjects with MCI affecting multiple cognitive domains were significantly less likely to revert to normal cognition.

16.

Efficient estimation in partially linear single‐index models for longitudinal data

Quan Cai Suojin Wang 《Scandinavian Journal of Statistics》2019,46(1):116-141

In this paper, we consider the estimation of both the parameters and the nonparametric link function in partially linear single‐index models for longitudinal data that may be unbalanced. In particular, a new three‐stage approach is proposed to estimate the nonparametric link function using marginal kernel regression and the parametric components with generalized estimating equations. The resulting estimators properly account for the within‐subject correlation. We show that the parameter estimators are asymptotically semiparametrically efficient. We also show that the asymptotic variance of the link function estimator is minimized when the working error covariance matrices are correctly specified. The new estimators are more efficient than estimators in the existing literature. These asymptotic results are obtained without assuming normality. The finite‐sample performance of the proposed method is demonstrated by simulation studies. In addition, two real‐data examples are analyzed to illustrate the methodology.

17.

Robust estimation of complicated profiles using wavelets

Hamid Shahriari 《Communications in Statistics - Theory and Methods》2017,46(4):1573-1593

Some quality characteristics are well defined when they are treated as response variables and their relationships to some independent variables are identified. This relationship is called a profile. Parametric models, such as linear models, may be used to model the profiles. However, due to the complexity of many processes in practical applications, it is often inappropriate to model the process using parametric models. In these cases nonparametric methods are used to model the processes. One of the most applicable nonparametric methods used to model complicated profiles is the wavelet. Many authors have considered the use of the wavelet transformation only for monitoring the processes in phase II. The problem of estimating the in-control profile in phase I using the wavelet transformation has not been deeply addressed. Usually classical estimators are used in phase I to estimate the in-control profiles, even when the wavelet transformation is used. These estimators are suitable if the data do not contain outliers. However, when outliers exist, these estimators cannot estimate the in-control profile properly. In this research, a robust method of estimating the in-control profiles is proposed, which is insensitive to the presence of outliers and can be applied when the wavelet transformation is used. The proposed estimator is the combination of robust clustering and the S-estimator. This estimator is compared with the classical estimator of the in-control profile in the presence of outliers. The results from a large simulation study show that using the proposed method, one can estimate the in-control profile precisely when the data are contaminated either locally or globally.

18.

Consistent estimators of the variance-covariance matrix of the GMANOVA model with missing data

Robert D. Mensah R.K. Elswick Jr. Vernon M. Chinchilli 《Communications in Statistics - Theory and Methods》2013,42(6):1495-1514

A common problem in multivariate general linear models is partially missing response data. The simplest method of analysis in the presence of missing data has been to delete all observations on any individual with any missing data (listwise deletion) and utilize a traditional complete data approach. However, this can result in a great loss of information, and perhaps inconsistencies in the estimation of the variance-covariance matrix. In the generalized multivariate analysis of variance (GMANOVA) model with missing data, Kleinbaum (1973) proposed an estimated generalized least squares approach. In order to apply this, however, a consistent estimate of the variance-covariance matrix is needed. Kleinbaum proposed an estimator which is unbiased and consistent, but it does not take advantage of the fact that the underlying model is GMANOVA and not MANOVA. Using the fact that the underlying model is GMANOVA, we have constructed four other consistent estimators. A Monte Carlo simulation experiment is conducted to further examine how well these estimators compare to the estimator proposed by Kleinbaum.

19.

A comparative study of doubly robust estimators of the mean with missing data

《Journal of Statistical Computation and Simulation》2012,82(12):2039-2058

Doubly robust (DR) estimators of the mean with missing data are compared. An estimator is DR if it remains consistent when either the regression of the missing variable on the observed variables or the missing data mechanism is correctly specified. One method is to include the inverse of the propensity score as a linear term in the imputation model [D. Firth and K.E. Bennett, Robust models in probability sampling, J. R. Statist. Soc. Ser. B. 60 (1998), pp. 3–21; D.O. Scharfstein, A. Rotnitzky, and J.M. Robins, Adjusting for nonignorable drop-out using semiparametric nonresponse models (with discussion), J. Am. Statist. Assoc. 94 (1999), pp. 1096–1146; H. Bang and J.M. Robins, Doubly robust estimation in missing data and causal inference models, Biometrics 61 (2005), pp. 962–972]. Another method is to calibrate the predictions from a parametric model by adding a mean of the weighted residuals [J.M. Robins, A. Rotnitzky, and L.P. Zhao, Estimation of regression coefficients when some regressors are not always observed, J. Am. Statist. Assoc. 89 (1994), pp. 846–866; D.O. Scharfstein, A. Rotnitzky, and J.M. Robins, Adjusting for nonignorable drop-out using semiparametric nonresponse models (with discussion), J. Am. Statist. Assoc. 94 (1999), pp. 1096–1146]. The penalized spline propensity prediction (PSPP) model includes the propensity score into the model non-parametrically [R.J.A. Little and H. An, Robust likelihood-based analysis of multivariate data with missing values, Statist. Sin. 14 (2004), pp. 949–968; G. Zhang and R.J. Little, Extensions of the penalized spline propensity prediction method of imputation, Biometrics, 65(3) (2008), pp. 911–918]. All these methods have consistency properties under misspecification of regression models, but their comparative efficiency and confidence coverage in finite samples have received little attention.
In this paper, we compare the root mean square error (RMSE), width of confidence interval and non-coverage rate of these methods under various mean and response propensity functions. We study the effects of sample size and robustness to model misspecification. The PSPP method yields estimates with smaller RMSE and width of confidence interval compared with other methods under most situations. It also yields estimates with confidence coverage close to the 95% nominal level, provided the sample size is not too small.

20.

Comparison of causal effect estimators under exposure misclassification

Manoochehr Babanezhad Stijn Vansteelandt Els Goetghebeur 《Journal of statistical planning and inference》2010

Over the past decades, various principles for causal effect estimation have been proposed, all differing in terms of how they adjust for measured confounders: either via traditional regression adjustment, by adjusting for the expected exposure given those confounders (e.g., the propensity score), or by inversely weighting each subject's data by the likelihood of the observed exposure, given those confounders. When the exposure is measured with error, this raises the question whether these different estimation strategies might be differently affected and whether one of them is to be preferred for that reason. In this article, we investigate this by comparing inverse probability of treatment weighted (IPTW) estimators and doubly robust estimators for the exposure effect in linear marginal structural mean models (MSM) with G-estimators, propensity score (PS) adjusted estimators and ordinary least squares (OLS) estimators for the exposure effect in linear regression models. We find analytically that these estimators are equally affected when exposure misclassification is independent of the confounders, but not otherwise. Simulation studies reveal similar results for time-varying exposures and when the model of interest includes a logistic link.