Finite-sample analysis of impacts of unlabeled data and their labeling mechanisms in linear discriminant analysis
Authors: Kenichi Hayashi, Keiji Takai
Affiliations: 1. Graduate School of Medicine, Osaka University, Osaka, Japan; 2. Faculty of Commerce, Kansai University, Osaka, Japan
Abstract: It is widely believed that unlabeled data can improve prediction accuracy in classification problems. Although theoretical studies exist on when and how unlabeled data are beneficial, the actual improvement in prediction has not been systematically investigated for finite samples. We investigate the impact of unlabeled data in linear discriminant analysis and compare the error rates of classifiers estimated with and without unlabeled data. Our focus is the labeling mechanism, which characterizes the probabilistic structure by which labeled cases occur. The results imply that an extremely small proportion of unlabeled data has a large effect on the analysis results.
Keywords: Classification error; Missing data; Monte Carlo simulation; Relative efficiency; Semi-supervised learning