Clustering confidence sets |
| |
Authors: | Nicoleta Serban |
| |
Affiliation: | Industrial Systems and Engineering School, Georgia Institute of Technology, Atlanta, GA, USA |
| |
Abstract: | In this article, we present a novel approach to clustering finite- or infinite-dimensional objects observed with different uncertainty levels. The novelty lies in using confidence sets rather than point estimates to obtain cluster membership and the number of clusters based on the distance between the confidence set estimates. The minimal and maximal distances between the confidence set estimates provide confidence intervals for the true distances between objects. The upper bounds of these confidence intervals can be used to minimize the within-cluster variability, and the lower bounds can be used to maximize the between-cluster variability. We assign objects to the same cluster based on a min–max criterion, and we separate clusters based on a max–min criterion. We illustrate our technique by clustering a large number of curves and evaluate our clustering procedure with a synthetic example and with a specific application. |
| |
Keywords: | Single-linkage tree; Distance confidence interval; Gap sequence; Clustering error rate; Simultaneous confidence sets; Compustat Global database; Q-ratio |
This article is indexed in ScienceDirect and other databases.
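
The abstract's idea that minimal and maximal distances between confidence sets bound the true distance between objects can be illustrated with a minimal sketch. It assumes each confidence set is a Euclidean ball given by a center (the point estimate) and a radius; the ball shape and the function name `distance_interval` are assumptions made for illustration and are not taken from the article.

```python
import math


def distance_interval(center_a, radius_a, center_b, radius_b):
    """Bound the distance between two unknown points, each known only
    to lie inside a Euclidean ball (an assumed confidence-set shape).

    Returns (lower, upper): the minimal and maximal distances between
    any point of ball A and any point of ball B.
    """
    d = math.dist(center_a, center_b)  # distance between the point estimates
    lower = max(0.0, d - radius_a - radius_b)  # 0 when the balls overlap
    upper = d + radius_a + radius_b
    return lower, upper


# Two well-separated objects: the lower bound is positive.
far_lower, far_upper = distance_interval((0.0, 0.0), 1.0, (3.0, 4.0), 1.0)

# Two objects with overlapping confidence sets: the lower bound is 0.
near_lower, near_upper = distance_interval((0.0, 0.0), 1.0, (1.0, 0.0), 0.5)
```

Under this assumption, a small upper bound suggests two objects are close enough to share a cluster, while a large lower bound suggests they are separated; the specific min–max and max–min criteria are defined in the article itself and are not reproduced here.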