Classification performance resulting from a 2-means |
| |
Authors: | C. Ruwet, G. Haesbroeck |
| |
Institution: | University of Liège, Department of Mathematics, Liège, Belgium |
| |
Abstract: | The k-means procedure is probably one of the most common nonhierarchical clustering techniques. From a theoretical point of view, it is related to the search for the k principal points of the underlying distribution. In this paper, the classification resulting from that procedure for k=2 is shown to be optimal under a balanced mixture of two spherically symmetric and homoscedastic distributions. Then, the classification efficiency of the 2-means rule is assessed using the second-order influence function and compared to the classification efficiencies of Fisher and logistic discrimination. Influence functions are also used to compare the robustness to infinitesimal contamination of the 2-means method with that of the generalized 2-means technique. |
| |
Keywords: | Asymptotic loss; Cluster analysis; Error rate; k-means; Influence function; Principal points; Robustness |
This article is indexed in ScienceDirect and other databases. |
|
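The optimality claim in the abstract can be illustrated empirically. The sketch below is not the authors' procedure: it assumes a balanced mixture of two spherical Gaussians with identity covariance (one instance of the spherically symmetric, homoscedastic setting), runs plain Lloyd iterations for k=2, and compares the resulting error rate with the Bayes error of that mixture. The names `two_means` and `error_rate`, the mean vector `mu`, the sample size, and the seeds are all hypothetical choices made for illustration.

```python
import math
import numpy as np

def two_means(x, n_iter=100, seed=0):
    """Plain Lloyd iterations for k=2; returns the two estimated centers."""
    rng = np.random.default_rng(seed)
    # Start from two distinct observations drawn at random.
    centers = x[rng.choice(len(x), size=2, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance of every point to each center, shape (n, 2).
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([x[labels == j].mean(axis=0) for j in range(2)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers

def error_rate(x, y, centers):
    """Misclassification rate of the nearest-center rule, up to label switching."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    pred = d2.argmin(axis=1)
    err = np.mean(pred != y)
    return min(err, 1.0 - err)  # cluster labels are arbitrary

# Assumed setting: balanced mixture of N(-mu, I) and N(+mu, I) in two dimensions.
rng = np.random.default_rng(42)
n, mu = 20000, np.array([1.5, 0.0])
y = rng.integers(0, 2, size=n)  # each component has prior probability 1/2
x = rng.standard_normal((n, 2)) + np.where(y[:, None] == 1, mu, -mu)

centers = two_means(x)
err_2means = error_rate(x, y, centers)

# Bayes error for this Gaussian mixture: Phi(-||mu||), with Phi the standard normal CDF.
bayes_error = 0.5 * (1.0 + math.erf(-np.linalg.norm(mu) / math.sqrt(2.0)))

print(f"2-means error rate: {err_2means:.4f}")
print(f"Bayes error:        {bayes_error:.4f}")
```

Under these assumptions the two error rates should be close for large samples, which is consistent with the optimality result stated in the abstract. Unbalancing the mixture (for example, drawing `y` with unequal probabilities) is a simple way to see the 2-means rule drift away from the Bayes rule.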