On kernel machine learning for propensity score estimation under complex confounding structures
Authors: Baiming Zou, Xinlei Mi, Patrick J Tighe, Gary G Koch, Fei Zou
Institution: 1. Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA; 2. Department of Preventive Medicine – Biostatistics, Quantitative Data Sciences Core (QDSC), Northwestern University, Chicago, IL, USA; 3. Department of Anesthesiology, University of Florida, Gainesville, Florida, USA
Abstract: Post-marketing data offer rich information and cost-effective resources for physicians and policy-makers to address critical scientific questions in clinical practice. However, the complex confounding structures (e.g., nonlinear and nonadditive interactions) embedded in these observational data often pose major analytical challenges to drawing valid conclusions. Furthermore, these data are often made available as electronic health records (EHRs) and are usually massive, with hundreds of thousands of observational records, which introduces additional computational challenges. In this paper, for comparative effectiveness analysis, we propose a statistically robust yet computationally efficient propensity score (PS) approach to adjust for the complex confounding structures. Specifically, we propose a kernel-based machine learning method for flexible and robust PS modeling that yields valid PS estimates from observational data with complex confounding structures. The estimated propensity score is then used in a second-stage analysis to obtain a consistent estimate of the average treatment effect. An empirical variance estimator based on the bootstrap is adopted. A split-and-merge algorithm is further developed to reduce the computational workload of the proposed method for big data and to obtain, as a by-product, a valid variance estimator of the average treatment effect estimate. As shown by extensive numerical studies and a comparative effectiveness analysis of postoperative pain EHR data, the proposed approach consistently outperforms competing methods, demonstrating its practical utility.
Keywords: electronic health record; inverse probability weighting; kernel machine learning; model selection
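
Illustrative sketch: the two-stage workflow described in the abstract (kernel-based PS estimation, then inverse probability weighting, with a split-and-merge step for large data) can be outlined roughly as below. This is not the authors' implementation. As labeled assumptions, a leave-one-out Gaussian kernel smoother stands in for the paper's kernel machine learning PS model, and the split-and-merge step is simplified to averaging chunk-level estimates and using their spread as a variance estimate. All function names, the bandwidth, the clipping bounds, and the simulated data are hypothetical.

```python
import math
import random


def gaussian_kernel(u, v, bandwidth):
    """Gaussian (RBF) kernel between two covariate vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_propensity(X, T, bandwidth=0.4, clip=0.05):
    """Leave-one-out kernel-smoothed estimate of P(T = 1 | X) per record.

    Assumption: a simple stand-in for the paper's kernel-based PS model.
    """
    scores = []
    for i, x_i in enumerate(X):
        num = den = 0.0
        for j, x_j in enumerate(X):
            if i == j:
                continue  # exclude the record itself
            w = gaussian_kernel(x_i, x_j, bandwidth)
            num += w * T[j]
            den += w
        e = num / den if den > 0 else 0.5
        # Clip so the inverse weights stay bounded.
        scores.append(min(max(e, clip), 1.0 - clip))
    return scores


def ipw_ate(Y, T, e):
    """Normalized (Hajek) inverse-probability-weighted ATE estimate."""
    w1 = [t / p for t, p in zip(T, e)]
    w0 = [(1 - t) / (1 - p) for t, p in zip(T, e)]
    mu1 = sum(w * y for w, y in zip(w1, Y)) / sum(w1)
    mu0 = sum(w * y for w, y in zip(w0, Y)) / sum(w0)
    return mu1 - mu0


def split_and_merge_ate(X, T, Y, n_splits=5, seed=0):
    """Estimate the ATE on disjoint random chunks, then merge by averaging.

    Assumption: the standard error is taken from the spread of the
    chunk-level estimates; the paper's algorithm may differ in detail.
    """
    rng = random.Random(seed)
    idx = list(range(len(Y)))
    rng.shuffle(idx)
    estimates = []
    for k in range(n_splits):
        chunk = idx[k::n_splits]
        Xc = [X[i] for i in chunk]
        Tc = [T[i] for i in chunk]
        Yc = [Y[i] for i in chunk]
        estimates.append(ipw_ate(Yc, Tc, kernel_propensity(Xc, Tc)))
    ate = sum(estimates) / n_splits
    var = sum((b - ate) ** 2 for b in estimates) / (n_splits - 1) / n_splits
    return ate, math.sqrt(var)


def simulate(n, seed=1, true_ate=2.0):
    """Toy data with nonlinear, nonadditive confounding (hypothetical)."""
    rng = random.Random(seed)
    X, T, Y = [], [], []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        confounder = x1 * x2 + x1 ** 2
        p = 1.0 / (1.0 + math.exp(-(1.5 * x1 * x2 + x1 ** 2 - 0.3)))
        t = 1 if rng.random() < p else 0
        X.append((x1, x2))
        T.append(t)
        Y.append(true_ate * t + confounder + rng.gauss(0.0, 0.5))
    return X, T, Y


X, T, Y = simulate(1000)
ate, se = split_and_merge_ate(X, T, Y)
print(f"ATE estimate: {ate:.3f} (SE {se:.3f}); simulated true effect: 2.0")
```

Splitting keeps the quadratic cost of the kernel computation confined to each chunk, which is the computational motivation the abstract gives for a split-and-merge design. The chunk-level estimates also supply a variance estimate without a separate bootstrap pass.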