Regularization through variable selection and conditional MLE with application to classification in high dimensions
Authors: Eitan Greenshtein, Junyong Park, Guy Lebanon
Affiliation: 1. Department of Statistical Science, Duke University, NC 27708-0251, USA; 2. Department of Mathematics and Statistics, University of Maryland, Baltimore County, MD 21250, USA; 3. College of Computing, Georgia Institute of Technology, GA 30332, USA
Abstract: It is often the case that high-dimensional data consist of only a few informative components. Standard statistical modeling and estimation in such a situation is prone to inaccuracies due to overfitting, unless regularization methods are applied. In the context of classification, we propose a class of regularization methods based on shrinkage estimators. The shrinkage is based on variable selection coupled with conditional maximum likelihood. Using Stein's unbiased estimator of the risk, we derive an estimator of the optimal shrinkage method within a certain class. We compare the optimal shrinkage method in the classification context with the optimal shrinkage method for estimating a mean vector under squared loss. The latter problem has been studied extensively, but its results do not appear to carry over fully to classification. We demonstrate and examine our method on simulated data and compare it to the feature annealed independence rule and Fisher's rule.
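The abstract only sketches how Stein's unbiased risk estimate (SURE) is used to pick a shrinkage level. As a loose illustration of that general idea (not the paper's conditional-MLE procedure for classification), the following Python sketch applies the classical SureShrink recipe: choose a soft-threshold level for a high-dimensional mean vector, observed with unit-variance Gaussian noise, by minimizing SURE over candidate thresholds. The function name, the simulated sparse mean, and the unit-variance assumption are all illustrative choices, not taken from the paper.

```python
import numpy as np

def sure_soft_threshold(x):
    """Illustrative SureShrink-style shrinkage (not the paper's method).

    Assumes x_i ~ N(theta_i, 1). For soft thresholding at level t,
    SURE(t) = d - 2 * #{i : |x_i| <= t} + sum_i min(|x_i|, t)^2,
    and we pick the candidate threshold minimizing it.
    """
    d = len(x)
    abs_x = np.abs(x)
    candidates = np.concatenate(([0.0], np.sort(abs_x)))
    sure = np.array([
        d - 2.0 * np.sum(abs_x <= t) + np.sum(np.minimum(abs_x, t) ** 2)
        for t in candidates
    ])
    t_opt = candidates[np.argmin(sure)]
    # Soft-threshold: shrink toward zero, zeroing "uninformative" coordinates.
    theta_hat = np.sign(x) * np.maximum(abs_x - t_opt, 0.0)
    return theta_hat, t_opt

# Toy example: a sparse mean vector observed with unit-variance noise.
rng = np.random.default_rng(0)
theta = np.concatenate([np.full(10, 3.0), np.zeros(990)])
x = theta + rng.standard_normal(theta.size)
theta_hat, t_opt = sure_soft_threshold(x)
print(f"chosen threshold = {t_opt:.2f}, "
      f"nonzero coordinates kept = {int(np.sum(theta_hat != 0))}")
```

This sketch targets mean estimation under squared loss, which the abstract explicitly contrasts with the classification setting; it is included only to make the SURE-based selection of a shrinkage rule concrete.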
Keywords: Classification; High dimensions; Conditional MLE; Stein's unbiased estimator