Identification of target clusters by using the restricted normal mixture model |
| |
Authors: | Seung-Gu Kim, Yung-Seop Lee |
| |
Affiliation: | 1. Department of Data and Information, Sangji University, Wonju, KangWon, 220-702, Korea; 2. Department of Statistics, Dongguk University-Seoul, Seoul, Korea |
| |
Abstract: | This paper addresses the problem of identifying groups that satisfy specified conditions on the means of feature variables. In this study, we refer to the identified groups as “target clusters” (TCs). To identify TCs, we propose a method based on the normal mixture model (NMM) restricted by a linear combination of means. We provide an expectation–maximization (EM) algorithm to fit the restricted NMM by the maximum-likelihood method. The convergence property of the EM algorithm and a reasonable set of initial estimates are presented. We demonstrate the method's usefulness and validity through a simulation study and two well-known data sets. The proposed method provides several types of useful clusters that would be difficult to obtain with conventional clustering or exploratory data analysis methods based on the ordinary NMM. A simple comparison with another target clustering approach shows that the proposed method is promising for identifying TCs. |
| |
Keywords: | EM algorithm; maximum-likelihood method; mean restrictions; microarray gene expression data; restricted normal mixture model; target clustering |
|
|