Estimating the number of clusters
Authors: Antonio Cuevas, Manuel Febrero, Ricardo Fraiman
Abstract: Hartigan (1975) defines the number q of clusters in a d-variate statistical population as the number of connected components of the set {f > c}, where f denotes the underlying density function on R^d and c is a given constant. Some usual cluster algorithms treat q as an input which must be given in advance. The authors propose a method for estimating this parameter which is based on the computation of the number of connected components of an estimate of {f > c}. This set estimator is constructed as a union of balls with centres at an appropriate subsample which is selected via a nonparametric density estimator of f. The asymptotic behaviour of the proposed method is analyzed. A simulation study and an example with real data are also included.
Keywords: Cluster analysis; density estimates; level sets; number of modes; smoothed bootstrap; support estimation
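
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the bandwidth `h`, the ball radius `eps`, and the function names `kde` and `estimate_num_clusters` are all assumptions made for illustration, and the paper's own rules for choosing these quantities are not reproduced here. The sketch keeps the sample points whose estimated density exceeds c, places a closed ball of radius `eps` at each kept point, and counts the connected components of the union of balls (two balls meet exactly when their centres are at most `2 * eps` apart).

```python
import math


def kde(points, x, h):
    """Gaussian kernel density estimate at x (bandwidth h is an assumed tuning value)."""
    d = len(x)
    norm = len(points) * (h * math.sqrt(2.0 * math.pi)) ** d
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(p, x))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / norm


def estimate_num_clusters(points, c, h, eps):
    """Count connected components of a union of balls estimating {f > c}.

    points: list of equal-length tuples (the sample).
    c: density level; h: kernel bandwidth; eps: ball radius (both hypothetical choices).
    """
    # Subsample: keep only points where the estimated density exceeds c.
    kept = [p for p in points if kde(points, p, h) > c]

    # Union-find over the kept points.
    parent = list(range(len(kept)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Two closed balls of radius eps intersect iff their centres are within 2 * eps.
    for i in range(len(kept)):
        for j in range(i + 1, len(kept)):
            if math.dist(kept[i], kept[j]) <= 2.0 * eps:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(kept))})


# Toy 1-D sample: two dense groups plus one isolated low-density point at 2.5.
group_a = [(0.1 * k,) for k in range(10)]
group_b = [(5.0 + 0.1 * k,) for k in range(10)]
sample = group_a + group_b + [(2.5,)]

q_hat = estimate_num_clusters(sample, c=0.1, h=0.3, eps=0.1)
print(q_hat)
```

In this toy sample the isolated point has an estimated density below the level c = 0.1, so it is dropped before the balls are formed. Lowering c to 0 keeps it, and it then counts as a component of its own, which shows how the estimate depends on the chosen level.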