Identifying influential observations in hierarchical cluster analysis
Authors: I. T. Jolliffe, B. Jones, B. J. T. Morgan
Institution: 1. University of Aberdeen; 2. De Montfort University; 3. University of Kent
Abstract: In a cluster analysis of a multivariate data set, it may happen that one or two observations have a disproportionately large effect on the analysis, in the sense that their removal causes a dramatic change to the results. It is important to be able to identify such influential observations, and the present paper addresses this problem. To do so, we must first quantify the effect of a single observation. Various definitions are discussed, and criteria for identifying influential observations are investigated; the minimum spanning tree and the number of neighbours of each observation are considered. The investigation concentrates on single-link cluster analysis, although complete-link analysis is also briefly discussed. Patterns emerge in both real and simulated data, which suggest ways of predicting observations with no effect and those with the greatest effect. It is not necessary to recalculate the results with each observation omitted, an economy of presentation as well as labour.
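The idea of quantifying the effect of a single observation can be illustrated with a small leave-one-out sketch. The abstract does not give the authors' definitions, so the influence measure below (the largest absolute change in single-link cophenetic distance among the remaining observations) is an assumption made purely for illustration, as are the function names and the toy data. The sketch relies on the standard fact that the single-link merge height of two observations equals the largest edge on the path joining them in the minimum spanning tree.

```python
import math


def mst_edges(ids, dist):
    """Prim's algorithm on the complete graph over `ids`; returns (u, v, weight) edges."""
    ids = list(ids)
    # best[j] = (cheapest known distance from j to the tree, tree vertex giving it)
    best = {j: (dist(ids[0], j), ids[0]) for j in ids[1:]}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        w, parent = best.pop(j)
        edges.append((parent, j, w))
        for k in best:
            d = dist(j, k)
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


def single_link_cophenetic(ids, dist):
    """Single-link merge height for every pair, built by merging MST edges in increasing order."""
    members = {i: [i] for i in ids}   # cluster label -> observations in that cluster
    label = {i: i for i in ids}       # observation -> current cluster label
    coph = {}
    for u, v, w in sorted(mst_edges(ids, dist), key=lambda e: e[2]):
        a, b = label[u], label[v]
        for p in members[a]:
            for q in members[b]:
                coph[frozenset((p, q))] = w
        for q in members[b]:
            label[q] = a
        members[a].extend(members.pop(b))
    return coph


def influence_scores(points):
    """Illustrative (assumed) influence: max change in cophenetic distance when one point is omitted."""
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])
    full = single_link_cophenetic(range(n), dist)
    scores = []
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        reduced = single_link_cophenetic(rest, dist)
        scores.append(max(abs(full[pair] - reduced[pair]) for pair in reduced))
    return scores


# Toy data (hypothetical): two clumps on a line joined by one "bridge" point at x = 4.
points = [(0, 0), (1, 0), (2, 0), (4, 0), (6, 0), (7, 0), (8, 0)]
scores = influence_scores(points)
print(scores)
```

In this toy example the bridge point receives the largest score, while the two end points, which are leaves of the minimum spanning tree, receive a score of zero: removing a leaf leaves the rest of the tree intact, so the single-link results for the other observations do not change. This brute-force version recomputes the analysis for every omitted observation, which is exactly the labour the abstract says can be avoided; it is meant only as a baseline for checking any cheaper predictive criterion.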