Exhaustive k-nearest-neighbour subspace clustering |
| |
Authors: | Johann M Kraus, Ludwig Lausser |
| |
Institution: | Medical Systems Biology, Ulm University, 89069 Ulm, Germany |
| |
Abstract: | Cluster analysis is an important technique of exploratory data mining. It refers to a collection of statistical methods for learning the structure of data solely by exploring pairwise distances or similarities. Meaningful structures are often not detectable in high-dimensional feature spaces, because relevant features can be obscured by noise from irrelevant measurements. These observations led to the design of subspace clustering algorithms, which can identify clusters that originate from different subsets of features. Hunting for clusters in arbitrary subspaces is intractable due to the infinite search space spanned by all feature combinations. In this work, we present a subspace clustering algorithm that can exhaustively screen all feature combinations of small- or medium-sized datasets (approximately 30 features). Based on a robustness analysis via subsampling, we identify a set of stable candidate subspace cluster solutions. |
| |
Keywords: | subspace clustering; exhaustive search; k-NN clustering; multi-objective optimization; cluster number estimation; cluster map |
|
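To give a sense of the scale of the exhaustive screen mentioned in the abstract, the following minimal sketch counts and enumerates candidate subspaces. It assumes that "feature combinations" means non-empty subsets of the original features (axis-parallel subspaces); the function names count_feature_subsets and iter_feature_subsets are hypothetical and are not taken from the authors' implementation.

```python
from itertools import combinations
from math import comb


def count_feature_subsets(d):
    """Number of non-empty subsets of d features (equals 2**d - 1)."""
    return sum(comb(d, k) for k in range(1, d + 1))


def iter_feature_subsets(d):
    """Yield every non-empty subset of feature indices 0..d-1,
    ordered by subset size, as tuples of indices."""
    for k in range(1, d + 1):
        yield from combinations(range(d), k)


if __name__ == "__main__":
    # Assumption: about 30 features, the upper range named in the abstract.
    print(count_feature_subsets(30))
    # Small case that can be listed in full.
    for subset in iter_feature_subsets(3):
        print(subset)
```

Under this assumption, 30 features yield 2^30 - 1 = 1,073,741,823 candidate subspaces, which indicates why an exhaustive screen is restricted to small- or medium-sized feature sets.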