
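Metacost Based Semi-supervised Heterogeneous Ensemble Model for Customer Credit Scoring
Cite this article: YAN Lan, LI Si-han, XIAO Yi, KOU Yu-xuan, LIU Dun-hu, XIAO Jin. Metacost Based Semi-supervised Heterogeneous Ensemble Model for Customer Credit Scoring[J]. Chinese Journal of Management Science, 2022, 30(12): 211-221.
Authors: YAN Lan, LI Si-han, XIAO Yi, KOU Yu-xuan, LIU Dun-hu, XIAO Jin
Affiliations: 1. Intensive Language Training Center, Sichuan University, Chengdu 610064, Sichuan; 2. Business School, Sichuan University, Chengdu 610064, Sichuan; 3. School of Information Management, Central China Normal University, Wuhan 430079, Hubei; 4. School of Business, Macau University of Science and Technology, Macau SAR 999078; 5. School of Management, Chengdu University of Information Technology, Chengdu 610225, Sichuan
Funding: General Program of the National Natural Science Foundation of China (72171160, 71974020); Sichuan Province Outstanding Youth Fund (2020JDJQ0021); Sichuan Province "Tianfu Ten-Thousand Talents Program" (0082204151153); Sichuan Province Soft Science Research Program (2020JDR0120); Sichuan University National Leading Talent Cultivation Fund (sksyl2021-03)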
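Abstract (translated from the Chinese): To address the problems that arise in real-world credit scoring, this study combines meta cost-sensitive learning, semi-supervised learning, and heterogeneous ensemble techniques and proposes a Metacost based semi-supervised heterogeneous ensemble model (Meta-Semi-HE) for customer credit scoring. The model comprises three stages: 1) the Metacost method is used to modify the initial labeled training set, giving Lm; 2) N heterogeneous classifiers hi (i = 1, …, N) are trained on Lm by AdaBoost, the concomitant ensemble Hi is used to selectively label samples from the unlabeled data set, these samples are added to Lm, and the N heterogeneous classifiers are retrained on the new Lm; this step is repeated, steadily improving classifier performance, until the termination condition is met; 3) the final N heterogeneous classifiers classify the test-set samples. An empirical analysis on six customer credit-scoring datasets shows that the proposed model achieves better customer credit-scoring performance than three existing semi-supervised ensemble models and two supervised ensemble models.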
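Stage 1 relies on Metacost, whose core step relabels each training sample with the class that minimizes expected misclassification cost. The following is a minimal sketch of that relabeling step only, not the authors' implementation. It assumes the class probabilities have already been estimated (Metacost typically obtains them from a bagged ensemble, which is omitted here), and the cost matrix, the probabilities, and the names `metacost_relabel`, `COST`, and `PROBS` are hypothetical.

```python
from typing import List, Sequence


def metacost_relabel(probs: Sequence[Sequence[float]],
                     cost: Sequence[Sequence[float]]) -> List[int]:
    """Relabel each sample with the class of minimum expected cost.

    probs[k][j]: estimated P(class j | sample k); assumed given.
    cost[i][j]:  cost of predicting class i when the true class is j.
    """
    new_labels = []
    for p in probs:
        # Expected cost of predicting each class i for this sample.
        expected = [sum(p_j * c_ij for p_j, c_ij in zip(p, row)) for row in cost]
        new_labels.append(min(range(len(expected)), key=expected.__getitem__))
    return new_labels


# Hypothetical setup: class 0 = good credit, class 1 = bad credit.
COST = [[0.0, 5.0],   # predict good: no cost if good, 5 if actually bad
        [1.0, 0.0]]   # predict bad: 1 if actually good, no cost if bad
PROBS = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]

relabeled = metacost_relabel(PROBS, COST)
print(relabeled)  # [0, 1, 1]
```

With these illustrative costs, the second sample is relabeled as bad credit even though its estimated probability of being bad is only 0.2, because missing a bad customer is assumed to cost five times as much as rejecting a good one. This is how cost-sensitive relabeling counteracts class imbalance in the training set.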

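Keywords: customer credit scoring; imbalanced class distribution; cost-sensitive learning; semi-supervised; heterogeneous ensemble
Received: 2020-05-20
Revised: 2020-07-03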

Metacost Based Semi-supervised Heterogeneous Ensemble Model for Customer Credit Scoring
YAN Lan, LI Si-han, XIAO Yi, KOU Yu-xuan, LIU Dun-hu, XIAO Jin. Metacost Based Semi-supervised Heterogeneous Ensemble Model for Customer Credit Scoring[J]. Chinese Journal of Management Science, 2022, 30(12): 211-221.
Authors: YAN Lan, LI Si-han, XIAO Yi, KOU Yu-xuan, LIU Dun-hu, XIAO Jin
Institution: 1. Intensive Language Training Center, Sichuan University, Chengdu 610064, China; 2. Business School, Sichuan University, Chengdu 610064, China; 3. School of Information Management, Central China Normal University, Wuhan 430079, China; 4. School of Business, Macau University of Science and Technology, Macau 999078, China; 5. School of Management, Chengdu University of Information Technology, Chengdu 610225, China
Abstract: As the credit business expands, effective risk avoidance has become one of the main ways for the financial industry to maintain stable profits, and credit risk is among the most common and important types of risk the industry faces. Accurate customer credit scoring is therefore very important. However, the class distribution of the customer data used to build credit-scoring models is often highly imbalanced: customers with good credit far outnumber customers with bad credit. In addition, only the small number of customers who have successfully obtained loans can be labeled according to their later behavior, while the many customers who applied for loans but failed to obtain them cannot be labeled. These characteristics make it difficult to build scientific and accurate customer credit-scoring models, and existing research does not solve these problems well. To fill this gap, this study combines meta cost-sensitive learning, semi-supervised learning, and heterogeneous ensemble learning and proposes a Metacost based semi-supervised heterogeneous ensemble model (Meta-Semi-HE) for customer credit scoring. The model has three stages: 1) Metacost is used to modify the initial labeled training set to obtain Lm; 2) N heterogeneous classifiers hi (i = 1, …, N) are trained on Lm by AdaBoost, the concomitant ensemble Hi is used to selectively label samples from the unlabeled data set, the newly labeled samples are added to Lm, and the N heterogeneous classifiers are retrained on the new Lm; this step is repeated to improve the performance of the member classifiers until the termination condition is satisfied; 3) the final trained classifiers are used to classify the samples of the test set.
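The iterative labeling in stages 2 and 3 can be illustrated with a small sketch. It is not the authors' algorithm, and several choices are assumptions the abstract does not confirm: the concomitant ensemble Hi is taken to be all members except hi, a sample counts as confidently labeled only when Hi votes unanimously, each member keeps its own copy of Lm, training stops when no new sample is labeled or after `max_rounds`, and three toy one-dimensional learners stand in for the AdaBoost-trained heterogeneous classifiers. All function names are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Tuple

Sample = Tuple[float, int]            # (feature value, class label)
Predictor = Callable[[float], int]
Learner = Callable[[List[Sample]], Predictor]


def centroid_learner(data: List[Sample]) -> Predictor:
    """Toy learner: predict the class whose mean feature value is nearest."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in data if y == label]
        means[label] = sum(xs) / len(xs)
    return lambda x: min(means, key=lambda c: abs(x - means[c]))


def midpoint_learner(data: List[Sample]) -> Predictor:
    """Toy learner: threshold halfway between the two classes' closest points."""
    cut = (max(x for x, y in data if y == 0) + min(x for x, y in data if y == 1)) / 2
    return lambda x: int(x > cut)


def nearest_neighbor_learner(data: List[Sample]) -> Predictor:
    """Toy learner: copy the label of the closest training point."""
    points = list(data)
    return lambda x: min(points, key=lambda s: abs(x - s[0]))[1]


def train_semi_supervised(labeled: List[Sample], unlabeled: List[float],
                          learners: List[Learner], max_rounds: int = 10) -> List[Predictor]:
    train_sets = [list(labeled) for _ in learners]   # one copy of Lm per member (assumption)
    pool = list(unlabeled)
    members = [fit(ts) for fit, ts in zip(learners, train_sets)]
    for _ in range(max_rounds):
        added = set()
        for i in range(len(members)):
            others = members[:i] + members[i + 1:]   # concomitant ensemble H_i (assumption)
            for x in pool:
                votes = {h(x) for h in others}
                if len(votes) == 1:                  # unanimous vote counts as confident
                    train_sets[i].append((x, votes.pop()))
                    added.add(x)
        if not added:                                # assumed termination condition
            break
        pool = [x for x in pool if x not in added]
        members = [fit(ts) for fit, ts in zip(learners, train_sets)]
    return members


def majority_vote(members: List[Predictor], x: float) -> int:
    """Stage 3: combine the final members by simple majority vote."""
    return Counter(h(x) for h in members).most_common(1)[0][0]


labeled = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
unlabeled = [1.5, 3.0, 7.0, 8.5]
members = train_semi_supervised(
    labeled, unlabeled, [centroid_learner, midpoint_learner, nearest_neighbor_learner])
print(majority_vote(members, 4.0), majority_vote(members, 6.0))  # 0 1
```

Letting Hi, rather than hi itself, label the samples that hi is retrained on is a common design in co-training-style methods, since it reduces the chance that a member reinforces its own mistakes.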
The empirical analysis is conducted on six customer credit-scoring datasets, and the results show that Meta-Semi-HE achieves better customer credit-scoring performance than the five comparison models on the evaluation criteria AUC, f, Type I accuracy, and Type II accuracy. The study offers banks a new approach to customer credit-scoring modeling, which helps them avoid risks more effectively and promotes the healthy and stable development of the credit business in the financial industry.
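Of the evaluation criteria named above, AUC is the least sensitive to class imbalance, since it depends only on how the model ranks bad-credit customers relative to good-credit ones. The sketch below computes it by direct pairwise comparison. The labels and scores are made up for illustration, and treating bad credit as the positive class (label 1) is an assumption, not something the abstract specifies.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive sample has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced test set: four good customers (0), two bad (1).
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.7, 0.9]   # predicted probability of bad credit
result = auc(labels, scores)
print(result)  # 0.875
```

Here 7 of the 8 positive-negative pairs are ranked correctly, giving 0.875. The pairwise form is quadratic in the number of samples, which is fine for illustration but slower than rank-based formulas on large datasets.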
Keywords: customer credit scoring; imbalanced class distribution; cost-sensitive learning; semi-supervised; heterogeneous ensemble