Proximal support vector machine techniques on medical prediction outcome |
| |
Authors: | Krystallenia Drosou |
| |
Affiliation: | Department of Mathematics, National Technical University of Athens, Zografou, Athens, Greece |
| |
Abstract: | One of the major issues in medical field constitutes the correct diagnosis, including the limitation of human expertise in diagnosing the disease in a manual way. Nowadays, the use of machine learning classifiers, such as support vector machines (SVM), in medical diagnosis is increasing gradually. However, traditional classification algorithms can be limited in their performance when they are applied on highly imbalanced data sets, in which negative examples (i.e. negative to a disease) outnumber the positive examples (i.e. positive to a disease). SVM constitutes a significant improvement and its mathematical formulation allows the incorporation of different weights so as to deal with the problem of imbalanced data. In the present work an extensive study of four medical data sets is conducted using a variant of SVM, called proximal support vector machine (PSVM) proposed by Fung and Mangasarian [9 G.M. Fung and O.L. Mangasarian, Proximal support vector machine classifiers, in Proceedings KDD-2001: Knowledge Discovery and Data Mining, F. Provost and R. Srikant, eds., Association for Computing Machinery, San Francisco, CA, New York, 2001, pp. 77–86. Available at ftp://ftp.cs.wisc.edu/pub/dmi/tech-reports/01-02.ps. [Google Scholar]]. Additionally, in order to deal with the imbalanced nature of the medical data sets we applied both a variant of SVM, referred as two-cost support vector machine and a modification of PSVM referred as modified PSVM. Both algorithms incorporate different weights one for each class examples. |
| |
Keywords: | Medical data sets binary classification proximal support vector machines MPSVM imbalanced data |
|
|