A comparison of regularization methods applied to the linear discriminant function with high-dimensional microarray data



Authors:	John A Ramey Phil D Young

Institution:	1. Pacific Northwest National Laboratory, Applied Statistics and Computational Modeling, PO Box 999, MSIN K7-20, Richland, WA 99352, USA; 2. Baylor University, Department of Statistical Science, One Bear Place #97140, Waco, TX 76798-7140, USA. Email: john.ramey@pnnl.gov

Abstract:	Classification of gene expression microarray data is important in the diagnosis of diseases such as cancer, but the analysis is challenging because the number of genes measured is typically much larger than the sample size. Consequently, classification methods for microarray data often rely on regularization to stabilize the classifier and improve classification performance. Numerous regularization techniques, such as covariance-matrix regularization, are available, which in practice makes the choice of method difficult. In this paper, we compare the classification performance of five covariance-matrix regularization methods applied to the linear discriminant function, using two simulated high-dimensional data sets and five well-known high-dimensional microarray data sets. In our simulation study, we found that the minimum distance empirical Bayes method reported in Srivastava and Kubokawa [Comparison of discrimination methods for high dimensional data, J. Japan Statist. Soc. 37(1) (2007), pp. 123–134] and the new linear discriminant analysis reported in Thomaz, Kitani, and Gillies [A Maximum Uncertainty LDA-based approach for Limited Sample Size problems – with application to Face Recognition, J. Braz. Comput. Soc. 12(1) (2006), pp. 1–12] perform consistently well and often outperform three other prominent regularization methods. Finally, we conclude with some recommendations for practitioners.
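
To illustrate why regularization is needed when the dimension exceeds the sample size, the following is a minimal sketch of a linear discriminant function with a ridge-type regularized pooled covariance matrix. It is a generic illustration and is not claimed to be any of the five methods compared in the paper; the function names `fit_ridge_lda` and `predict_ridge_lda` and the shrinkage parameter `gamma` are hypothetical and chosen only for this example.

```python
import numpy as np


def fit_ridge_lda(X, y, gamma=1.0):
    """Fit a linear discriminant with a ridge-regularized pooled covariance.

    gamma is a hypothetical shrinkage parameter used for illustration only.
    """
    classes = np.unique(y)
    n, p = X.shape
    means = np.array([X[y == k].mean(axis=0) for k in classes])
    priors = np.array([(y == k).mean() for k in classes])

    # Pooled within-class covariance; it is singular whenever p > n - K.
    centered = X - means[np.searchsorted(classes, y)]
    pooled = centered.T @ centered / (n - len(classes))

    # Adding gamma * I makes the estimate positive definite, hence invertible.
    regularized = pooled + gamma * np.eye(p)

    # Solve a linear system instead of forming the inverse; coef is (p, K).
    coef = np.linalg.solve(regularized, means.T)
    intercept = -0.5 * np.sum(means.T * coef, axis=0) + np.log(priors)
    return classes, coef, intercept


def predict_ridge_lda(model, X):
    """Assign each row of X to the class with the largest discriminant score."""
    classes, coef, intercept = model
    scores = X @ coef + intercept
    return classes[np.argmax(scores, axis=1)]
```

With, say, 20 samples and 100 features, the unregularized pooled covariance cannot be inverted, whereas the regularized version can; in practice `gamma` would be chosen by cross-validation or by an estimator-specific rule.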

Keywords:	linear discriminant analysis; covariance-matrix regularization; high-dimensional supervised classification; microarray data; expected error rate
