Cluster detection and clustering with random start forward searches
Authors: Anthony C Atkinson; Marco Riani; Andrea Cerioli
Institution: 1. Department of Statistics, The London School of Economics, London, UK; 2. Dipartimento di Scienze Economiche e Aziendale, Università di Parma, Parma, Italy
Abstract: The forward search is a method of robust data analysis in which outlier-free subsets of the data of increasing size are used in model fitting; the data are then ordered by closeness to the model. Here the forward search, with many random starts, is used to cluster multivariate data. These random starts lead to the diagnostic identification of tentative clusters. Applying the forward search to each proposed cluster then establishes cluster membership by identifying non-members as outliers. The method requires no prior information on the number of clusters and does not seek to classify all observations. These properties are illustrated by the analysis of 200 six-dimensional observations on Swiss banknotes. The importance of linked plots and brushing in elucidating data structures is illustrated. We also provide an automatic method for determining cluster centres and compare the behaviour of our method with model-based clustering. In a simulated example with eight clusters, our method provides more stable and accurate solutions than model-based clustering. We consider the computational requirements of both procedures.
Keywords: Brushing; data structure; forward search; graphical methods; linked plots; Mahalanobis distance; MM estimation; outliers; S estimation; Tukey's biweight
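
The random-start forward search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain multivariate normal fit (sample mean and covariance) at each step, an initial subset of p + 1 randomly chosen units, and the minimum Mahalanobis distance among units outside the subset as the monitored statistic. The function names `mahalanobis_sq`, `forward_search` and `random_start_searches` are hypothetical.

```python
import numpy as np

def mahalanobis_sq(X, subset_idx):
    """Squared Mahalanobis distances of every row of X from the fit to X[subset_idx]."""
    sub = X[subset_idx]
    mu = sub.mean(axis=0)
    cov = np.atleast_2d(np.cov(sub, rowvar=False))
    # Assumption: pinv is used so that a near-singular covariance
    # from a very small subset does not stop the search.
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.pinv(cov), diff)
    return np.maximum(d2, 0.0)

def forward_search(X, start_idx):
    """Grow the subset one unit at a time from start_idx.

    Returns, for each subset size m, the minimum Mahalanobis distance
    among the units not in the subset.
    """
    n = X.shape[0]
    subset = np.asarray(start_idx)
    trajectory = []
    while len(subset) < n:
        d2 = mahalanobis_sq(X, subset)
        outside = np.setdiff1d(np.arange(n), subset)
        trajectory.append(np.sqrt(d2[outside].min()))
        # The next subset is the m + 1 units closest to the current fit,
        # so units may leave the subset as well as join it.
        subset = np.argsort(d2)[: len(subset) + 1]
    return np.array(trajectory)

def random_start_searches(X, n_starts=20, seed=0):
    """Run forward searches from n_starts random subsets of size p + 1."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m0 = p + 1  # assumed initial subset size
    starts = [rng.choice(n, size=m0, replace=False) for _ in range(n_starts)]
    return np.array([forward_search(X, s) for s in starts])

if __name__ == "__main__":
    # Synthetic two-cluster data, for illustration only.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(8, 1, (30, 2))])
    trajectories = random_start_searches(X, n_starts=5)
    print(trajectories.shape)
```

Each row of the returned array is one search trajectory, indexed by subset size. Plotting the rows against subset size gives one way to look for the diagnostic structure the abstract mentions; how tentative clusters are then read off and confirmed is specific to the paper and is not reproduced here.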