Grouped Variable Selection Using Area under the ROC with Imbalanced Data
Authors:Yang Li  Yichen Qin  Limin Wang  Jiaxu Chen  Shuangge Ma
Institution:1. Center for Applied Statistics, Renmin University of China, Beijing, P.R. China;2. School of Statistics, Renmin University of China, Beijing, P.R. China;3. Department of Operations, Business Analytics and Information Systems, University of Cincinnati, Ohio, USA;4. School of Preclinical Medicine, Beijing University of Chinese Medicine, Beijing, P.R. China;5. School of Statistics, Renmin University of China, Beijing, P.R. China;6. Department of Biostatistics, Yale University, New Haven, Connecticut, USA
Abstract:Imbalanced data biases classification and lowers classification accuracy for the minority class. In this article, we propose a method for selecting grouped variables using the area under the ROC curve with an adjustable prediction cut point. The proposed method improves classification accuracy for the minority class by maximizing the true positive rate. Simulation results show that the proposed method is suitable for both categorical and continuous covariates. An illustrative analysis of the SHS data in TCM demonstrates the application of the proposed method.
Keywords:Area under ROC  Group lasso  Imbalanced data  True positive rate
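
Illustration: the two quantities named in the abstract, the area under the ROC curve and the true positive rate at an adjustable cut point, can be sketched on a toy imbalanced sample as below. This is only a minimal sketch and not the authors' procedure: it omits the group lasso selection step entirely, and the function names (`empirical_auc`, `rates_at_cut`), the scores, and the labels are hypothetical choices made for this example.

```python
def empirical_auc(scores, labels):
    """Mann-Whitney estimate of the AUC: the fraction of (positive, negative)
    pairs in which the positive case scores higher, counting ties as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rates_at_cut(scores, labels, cut):
    """True and false positive rates when class 1 is predicted if score >= cut."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    tpr = sum(s >= cut for s in pos) / len(pos)
    fpr = sum(s >= cut for s in neg) / len(neg)
    return tpr, fpr


# Hypothetical imbalanced sample: 2 minority (label 1) and 8 majority (label 0) cases.
scores = [0.45, 0.30, 0.40, 0.20, 0.10, 0.15, 0.25, 0.05, 0.35, 0.12]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

auc = empirical_auc(scores, labels)
tpr_default, fpr_default = rates_at_cut(scores, labels, 0.5)  # conventional cut point
tpr_low, fpr_low = rates_at_cut(scores, labels, 0.3)          # lowered cut point

print(f"AUC = {auc:.3f}")
print(f"cut 0.5: TPR = {tpr_default:.2f}, FPR = {fpr_default:.2f}")
print(f"cut 0.3: TPR = {tpr_low:.2f}, FPR = {fpr_low:.2f}")
```

In this toy sample the conventional cut point of 0.5 classifies no minority case correctly, while lowering the cut point to 0.3 recovers both at the cost of some false positives. The AUC does not depend on the cut point, which is one reason to pair it with an adjustable threshold when the classes are imbalanced.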