On kernel machine learning for propensity score estimation under complex confounding structures

Authors:	Baiming Zou, Xinlei Mi, Patrick J Tighe, Gary G Koch, Fei Zou

Institution:	1. Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; 2. Department of Preventive Medicine – Biostatistics, Quantitative Data Sciences Core (QDSC), Northwestern University, Chicago, IL, USA; 3. Department of Anesthesiology, University of Florida, Gainesville, Florida, USA

Abstract:	Post-marketing data offer rich, cost-effective resources for physicians and policy-makers to address critical scientific questions in clinical practice. However, the complex confounding structures (e.g., nonlinear and nonadditive interactions) embedded in these observational data often pose major analytical challenges to drawing valid conclusions. Furthermore, because they are often made available as electronic health records (EHRs), these data are usually massive, with hundreds of thousands of observational records, which introduces additional computational challenges. In this paper, for comparative effectiveness analysis, we propose a statistically robust yet computationally efficient propensity score (PS) approach to adjust for the complex confounding structures. Specifically, we propose a kernel-based machine learning method for flexible and robust PS modeling that yields valid PS estimates from observational data with complex confounding structures. The estimated PS is then used in a second-stage analysis to obtain a consistent estimate of the average treatment effect. An empirical variance estimator based on the bootstrap is adopted. A split-and-merge algorithm is further developed to reduce the computational workload of the proposed method for big data and to obtain, as a by-product, a valid variance estimator of the average treatment effect estimate. As shown by extensive numerical studies and a comparative effectiveness analysis of postoperative pain EHR data, the proposed approach consistently outperforms competing methods, demonstrating its practical utility.

Keywords:	electronic health record; inverse probability weighting; kernel machine learning; model selection
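
The pipeline outlined in the abstract (kernel-based PS estimation, a second-stage inverse probability weighted estimate of the average treatment effect, bootstrap variance, and split-and-merge for large data) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' estimator: the abstract does not specify the kernel machine, so a Gaussian-kernel Nadaraya-Watson smoother is swapped in as a stand-in, and the split-and-merge variance formula (variance of block estimates divided by the number of blocks) is an assumption. All function names, the bandwidth, the clipping level, and the simulated data are hypothetical.

```python
import math
import random

def kernel_ps(X, T, bandwidth=0.5, clip=0.01):
    """Stand-in kernel PS: Gaussian-kernel smoother of T on X (assumption)."""
    ps = []
    for xi in X:
        num = den = 0.0
        for xj, tj in zip(X, T):
            d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
            w = math.exp(-d2 / (2 * bandwidth ** 2))
            num += w * tj
            den += w
        # Clip away from 0 and 1 so the inverse weights stay bounded.
        ps.append(min(max(num / den, clip), 1 - clip))
    return ps

def ipw_ate(X, T, Y, bandwidth=0.5):
    """Second stage: normalized inverse probability weighted ATE estimate."""
    ps = kernel_ps(X, T, bandwidth)
    w1 = [t / p for t, p in zip(T, ps)]
    w0 = [(1 - t) / (1 - p) for t, p in zip(T, ps)]
    mu1 = sum(w * y for w, y in zip(w1, Y)) / sum(w1)
    mu0 = sum(w * y for w, y in zip(w0, Y)) / sum(w0)
    return mu1 - mu0

def _subset(X, T, Y, idx):
    return [X[i] for i in idx], [T[i] for i in idx], [Y[i] for i in idx]

def bootstrap_var(X, T, Y, n_boot=50, bandwidth=0.5, seed=0):
    """Empirical variance of the ATE estimate over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(Y)
    ests = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ests.append(ipw_ate(*_subset(X, T, Y, idx), bandwidth))
    mean = sum(ests) / n_boot
    return sum((e - mean) ** 2 for e in ests) / (n_boot - 1)

def split_and_merge(X, T, Y, n_blocks=4, bandwidth=0.5, seed=0):
    """Estimate the ATE per random block, then average the block estimates.

    The block-to-block spread gives a variance estimate as a by-product
    (assumed form: sample variance of block estimates / n_blocks).
    """
    idx = list(range(len(Y)))
    random.Random(seed).shuffle(idx)
    ests = [ipw_ate(*_subset(X, T, Y, idx[k::n_blocks]), bandwidth)
            for k in range(n_blocks)]
    ate = sum(ests) / n_blocks
    var = sum((e - ate) ** 2 for e in ests) / (n_blocks - 1) / n_blocks
    return ate, var

def simulate(n=400, seed=1):
    """Toy data with nonlinear, nonadditive confounding (hypothetical)."""
    rng = random.Random(seed)
    X, T, Y = [], [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = 1 / (1 + math.exp(-(0.8 * x1 * x2 + 0.5 * x1 ** 2 - 0.5)))
        t = 1 if rng.random() < p else 0
        X.append((x1, x2))
        T.append(t)
        Y.append(2.0 * t + x1 * x2 + x1 ** 2 + rng.gauss(0, 1))
    return X, T, Y
```

Calling `split_and_merge(*simulate())` returns the merged estimate and its variance. Because the kernel smoother costs on the order of n² kernel evaluations, working on K blocks of size n/K cuts the cost by roughly a factor of K, which is the computational motivation the abstract gives for splitting.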
