
Robust Binary Classification of High-dimensional Data
Cite this article: Shi Xingjie et al. Robust Binary Classification of High-dimensional Data[J]. 统计研究 (Statistical Research), 2020, 37(9): 95-105. DOI: 10.19343/j.cnki.11-1302/c.2020.09.009
Authors: Shi Xingjie et al.
Abstract: Empirical research often involves binary classification problems with high-dimensional covariates and outliers, which makes robust high-dimensional binary classification methods especially important. This paper proposes a binary classification method based on a smooth 0-1 loss function with a Lasso penalty, and uses the Fabs algorithm to solve the variable selection and parameter estimation problems efficiently. Simulation results show that the method remains robust across different proportions of outliers. Using the CHIP 2013 data, the method is applied to an empirical study of the factors influencing the high school enrollment decisions of migrant workers' children. The analysis finds that the parents' education level, the interaction between education level and family economic status, the child's gender, and the interaction between gender and ethnicity all have an important effect on the enrollment decision.

Keywords: 0-1 Loss   Fabs Algorithm   Variable Selection   Robust Binary Classification
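
The abstract does not state the exact form of the smoothed loss or the steps of the Fabs algorithm, so the following is only a minimal sketch under stated assumptions: the 0-1 loss is smoothed with a sigmoid of hypothetical steepness `k`, labels are coded as -1/+1, and plain proximal gradient descent (ISTA) is swapped in for Fabs. All function names and default values are illustrative and are not taken from the paper.

```python
import numpy as np


def smooth01_loss(beta, X, y, k=2.0):
    """Mean sigmoid-smoothed 0-1 loss (assumed form); y must be in {-1, +1}."""
    margin = y * (X @ beta)
    # Clip the exponent to avoid overflow warnings for very large margins.
    return np.mean(1.0 / (1.0 + np.exp(np.clip(k * margin, -50.0, 50.0))))


def smooth01_grad(beta, X, y, k=2.0):
    """Gradient of smooth01_loss with respect to beta."""
    margin = y * (X @ beta)
    s = 1.0 / (1.0 + np.exp(np.clip(k * margin, -50.0, 50.0)))
    # d s / d margin = -k * s * (1 - s); chain rule through margin = y * x'beta.
    weights = -k * s * (1.0 - s) * y
    return X.T @ weights / len(y)


def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1 (the Lasso shrinkage step)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fit_lasso_smooth01(X, y, lam=0.05, k=2.0, step=0.5, n_iter=500):
    """Proximal gradient descent on smooth01_loss + lam * ||beta||_1.

    This is NOT the Fabs algorithm; it is a simpler stand-in for illustration.
    The objective is non-convex, so only a local minimum is found.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta = soft_threshold(beta - step * smooth01_grad(beta, X, y, k),
                              step * lam)
    return beta
```

Because each observation contributes a loss between 0 and 1, a mislabeled or extreme point has bounded influence on the fit, which is the intuition behind the robustness claim. A call such as `fit_lasso_smooth01(X, y)` returns a coefficient vector in which entries shrunk exactly to zero correspond to unselected variables.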