The cluster correlation-network support vector machine for high-dimensional binary classification |
| |
Authors: | Rachid Kharoubi; Abdallah Mkhadri |
| |
Affiliation: | 1. Department of Mathematics, Université du Québec à Montréal, Montreal, QC, Canada; 2. Department of Mathematics, Cadi Ayyad University, Marrakech, Morocco |
| |
Abstract: | Identifying homogeneous subsets of predictors in classification can be challenging when the data are high-dimensional and the variables are highly correlated. We propose a new method, the cluster correlation-network support vector machine (CCNSVM), that simultaneously estimates the clusters of predictors that are relevant for classification and the coefficients of a penalized SVM. The new CCN penalty is a function of the well-known Topological Overlap Matrix, whose entries measure the strength of connectivity between predictors. CCNSVM implements an efficient algorithm that alternates between searching for clusters of predictors and optimizing a penalized SVM loss function, using Majorization–Minimization techniques and a coordinate descent algorithm. Combining clustering and sparsity in a single procedure provides additional insight into the value of exploiting dimension-reduction structure in high-dimensional binary classification. Simulation studies compare the performance of our procedure with that of its competitors. A practical application of CCNSVM to DNA methylation data illustrates its good behaviour. |
| |
Keywords: | Support vector machine; classification; clustering; variable selection; shrinkage |
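The abstract refers to the Topological Overlap Matrix (TOM), whose entries measure how strongly two predictors are connected. As a rough illustration only, the sketch below computes the standard unsigned TOM from a correlation matrix. It is not the authors' implementation and does not reproduce the CCN penalty, whose exact form is not given in this record. The function name `topological_overlap`, the soft-threshold exponent `beta`, and the toy correlation values are all assumptions made for this example.

```python
def topological_overlap(corr, beta=6):
    """Unsigned topological overlap from a correlation matrix.

    `corr` is a square list of lists; `beta` is a soft-threshold
    exponent (its default here is an assumption, not a value from the paper).
    """
    p = len(corr)
    # Soft-thresholded adjacency a_ij = |r_ij|^beta, with a zero diagonal.
    adj = [[abs(corr[i][j]) ** beta if i != j else 0.0 for j in range(p)]
           for i in range(p)]
    # Connectivity k_i = sum over u of a_iu.
    k = [sum(row) for row in adj]
    tom = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                continue
            # Strength of the neighbours shared by predictors i and j.
            shared = sum(adj[i][u] * adj[u][j] for u in range(p))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Toy correlation matrix for three predictors (made-up values).
corr = [[1.0, 0.9, 0.1],
        [0.9, 1.0, 0.2],
        [0.1, 0.2, 1.0]]
tom = topological_overlap(corr, beta=1)
```

In this toy example the first two predictors are strongly correlated, so their overlap is larger than either one's overlap with the third. A penalty built on such a matrix can then encourage strongly connected predictors to be treated as a group.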