Significance test of clustering under high dimensional setting with applications to cancer data |
| |
Authors: | Ping Dong, Yunquan Song |
| |
Affiliation: | 1. Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, People's Republic of China; 2. College of Science, China University of Petroleum, Qingdao, People's Republic of China |
| |
Abstract: | SigClust was developed to test the significance of clustering in high-dimensional data. Its cluster index (CI) is defined as the ratio of the within-cluster sum of squares to the total sum of squares. However, the resulting test is overly conservative: its empirical size falls below the nominal level. By removing the cumbersome terms in the CI, this paper proposes an improved index (BCI). The coefficient of variation of the BCI is significantly reduced, implying that the new index is more stable. Moreover, the new significance test (NewSig) maintains the size while providing greater power. Simulation experiments and two real cancer data examples illustrate the performance of the new methodology. |
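As a rough illustration of the cluster index described in the abstract, the following Python sketch computes the ratio of the within-cluster sum of squares to the total sum of squares for a given labelling. The function name `cluster_index`, the toy data, and the labels are assumptions made for illustration; this is not the authors' implementation, and it does not cover the BCI or the NewSig test, whose definitions are not given here.

```python
def cluster_index(points, labels):
    """Ratio of within-cluster to total sum of squares (hypothetical helper).

    Assumes the points are not all identical, so the total sum of
    squares is nonzero.
    """
    def centroid(rows):
        # Coordinate-wise mean of a list of equal-length rows.
        n = len(rows)
        return [sum(col) / n for col in zip(*rows)]

    def sum_sq(rows, center):
        # Sum of squared Euclidean distances from each row to `center`.
        return sum((x - c) ** 2 for row in rows for x, c in zip(row, center))

    # Total sum of squares about the overall mean.
    total = sum_sq(points, centroid(points))

    # Within-cluster sum of squares: each cluster about its own mean.
    within = 0.0
    for k in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == k]
        within += sum_sq(members, centroid(members))

    return within / total


# Toy data (assumed): two well-separated groups of two points each.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
ci = cluster_index(points, labels)
print(ci)
```

Values near 0 mean the clusters are tight relative to the overall spread, while a labelling that puts every point in one cluster gives exactly 1.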
| |
Keywords: | High dimensionality; cluster; significance test; p-value; power; empirical size; cancer genes |
|
|