Abstract: In most commercial applications of k-means clustering, researchers choose a single set of k seed points to start the partitioning process; often, this initial set of seeds is chosen randomly. Using Monte Carlo simulation, we show that significant benefits are associated with replicated starting configurations that incorporate seed selection procedures based on a hierarchical clustering of sample points drawn from the original data matrix. A real-world application of the approach is then presented.
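As an illustration of the general idea only, the following minimal Python sketch repeats k-means from several starting configurations, each seeded from a hierarchical clustering of a random sample of the data. The abstract does not specify the linkage method, the sample size, the number of replications, or the criterion for comparing solutions; the centroid linkage, the default values, the within-cluster sum of squares criterion, and all function names here are assumptions made for the example, not the authors' procedure.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def hierarchical_seeds(sample, k):
    """Agglomerate the sample down to k clusters and return their centroids.

    Centroid linkage is an assumption; the source does not name a linkage.
    Requires len(sample) >= k.
    """
    clusters = [[p] for p in sample]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sq_dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [mean(c) for c in clusters]

def kmeans(data, seeds, max_iter=100):
    """Plain Lloyd iterations from the given seeds; returns (centers, sse)."""
    centers = list(seeds)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in data:
            idx = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            groups[idx].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in data)
    return centers, sse

def replicated_kmeans(data, k, n_starts=10, sample_size=30, seed=0):
    """Run k-means from n_starts sample-based seedings; keep the lowest SSE.

    n_starts, sample_size, and the SSE criterion are illustrative assumptions.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        sample = rng.sample(data, min(sample_size, len(data)))
        centers, sse = kmeans(data, hierarchical_seeds(sample, k))
        if best is None or sse < best[1]:
            best = (centers, sse)
    return best

# Usage on synthetic data: three well-separated groups of 40 points each.
rng = random.Random(42)
true_centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
data = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in true_centers for _ in range(40)]
centers, sse = replicated_kmeans(data, k=3)
```

Clustering only a sample keeps the hierarchical step affordable, since its cost grows much faster with the number of points than a k-means pass does; drawing a fresh sample for each replication is what makes the starting configurations differ.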