

Selection bias in working with the top genes in supervised classification of tissue samples
Authors: X Zhu, C Ambroise, GJ McLachlan
Abstract: Currently there is much interest in using microarray gene-expression data to form prediction rules for the diagnosis of patient outcomes. A process of gene selection is usually carried out first to find those genes that are most useful according to some criterion for distinguishing between the given classes of tissue samples. However, there is a bias (selection bias) introduced in the estimate of the final version of a prediction rule that has been formed from a smaller subset of the genes that have been selected according to some optimality criterion. In this paper, we focus on the bias that arises when a full data set is not available in the first instance and the prediction rule is formed subsequently by working with the top-ranked genes from the full set. We demonstrate how large the subset of top genes must be before this selection bias is not of practical consequence.
Keywords: Gene selection; Support vector machine; Error rates; Cross-validation; Selection bias
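
The selection bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes pure-noise expression data with no real class signal, a simple standardized mean-difference ranking, and a nearest-centroid classifier in place of a support vector machine. The names `top_genes`, `centroid_predict`, and `cv_error` are hypothetical helpers introduced only for this illustration. It compares the cross-validated error when the top genes are chosen once from all samples with the error when genes are re-selected inside each training fold.

```python
import numpy as np

def top_genes(X, y, k):
    """Return indices of the k genes with the largest standardized
    difference in class means (illustrative ranking criterion)."""
    mean_diff = X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0)
    score = np.abs(mean_diff) / (X.std(axis=0) + 1e-12)
    return np.argsort(score)[::-1][:k]

def centroid_predict(X_train, y_train, X_test):
    """Nearest-centroid classifier (stand-in for an SVM in this sketch)."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cv_error(X, y, k, n_folds=5, select_inside=True, seed=0):
    """Cross-validated error rate. If select_inside is False, genes are
    picked once from ALL samples before cross-validation, which lets the
    test folds influence the selection (the biased protocol)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preselected = None if select_inside else top_genes(X, y, k)
    n_errors = 0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        if select_inside:
            genes = top_genes(X[train], y[train], k)
        else:
            genes = preselected
        pred = centroid_predict(X[train][:, genes], y[train],
                                X[test_idx][:, genes])
        n_errors += int((pred != y[test_idx]).sum())
    return n_errors / len(y)

# Assumed toy data: 40 samples, 2000 noise genes, labels unrelated to X.
rng = np.random.default_rng(42)
n_samples, n_genes = 40, 2000
X = rng.normal(size=(n_samples, n_genes))
y = np.array([0, 1] * (n_samples // 2))

biased = cv_error(X, y, k=10, select_inside=False)
unbiased = cv_error(X, y, k=10, select_inside=True)
print(f"genes pre-selected on all samples: error = {biased:.2f}")
print(f"genes re-selected in each fold:    error = {unbiased:.2f}")
```

Because the labels carry no information about the simulated genes, an honest error estimate should sit near 50%. Under these assumptions, the pre-selected protocol typically reports a much lower error, which is the optimism the abstract refers to. Raising `k` toward `n_genes` in this sketch is one way to explore how the gap shrinks as the retained subset grows.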
This article is indexed in ScienceDirect and other databases.