Clustering time-course microarray data using functional Bayesian infinite mixture model |
| |
Authors: | Claudia Angelini, Marianna Pensky |
| |
Affiliation: | 1. Istituto per le Applicazioni del Calcolo, “Mauro Picone”, CNR, Italy; 2. Department of Mathematics, University of Central Florida, Orlando, USA |
| |
Abstract: | This paper presents a new Bayesian clustering approach, based on an infinite mixture model and specifically designed for time-course microarray data. The problem is to group together genes that have “similar” expression profiles, given a set of noisy measurements of their expression levels over a specific time interval. To capture the temporal variation of each curve, a non-parametric regression approach is used. Each expression profile is expanded over a set of basis functions, and the coefficients of each curve are then modeled through a Bayesian infinite mixture of Gaussian distributions. The task of finding clusters of genes with similar expression profiles thus reduces to grouping together genes whose coefficients are sampled from the same distribution in the mixture. A Dirichlet process prior is naturally employed in such models, since it handles the uncertainty about the number of clusters automatically. Posterior inference is carried out by a split-and-merge MCMC sampling scheme that integrates out the parameters of the component distributions and updates only the latent vector of cluster memberships. The final configuration is obtained via the maximum a posteriori estimator. The performance of the method is studied on synthetic and real microarray data and compared with that of competing techniques. |
| |
Keywords: | mixture models; Dirichlet processes; MCMC; time-course microarray |
|
|
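The two modeling ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The Legendre basis, the plain least-squares fit, and the function names `fit_coefficients` and `crp_partition` are assumptions made for illustration, since the abstract does not say which basis or estimator is used. The second function draws a partition from the Chinese restaurant process, which is the prior on partitions induced by a Dirichlet process. It samples from the prior only and does not reproduce the split-and-merge MCMC sampler.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_coefficients(times, profiles, degree=3):
    """Least-squares coefficients of each profile on a Legendre basis.

    times:    shape (T,), measurement times
    profiles: shape (G, T), one row per gene
    returns:  shape (G, degree + 1)
    """
    times = np.asarray(times, dtype=float)
    profiles = np.asarray(profiles, dtype=float)
    # Rescale times to [-1, 1], where Legendre polynomials are orthogonal.
    span = times.max() - times.min()
    x = 2.0 * (times - times.min()) / span - 1.0
    design = legendre.legvander(x, degree)  # shape (T, degree + 1)
    coefs, *_ = np.linalg.lstsq(design, profiles.T, rcond=None)
    return coefs.T


def crp_partition(n_items, alpha, rng):
    """Draw one partition of n_items from a Chinese restaurant process."""
    labels = np.zeros(n_items, dtype=int)
    counts = [1]  # item 0 opens cluster 0
    for i in range(1, n_items):
        # Existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels[i] = k
    return labels


rng = np.random.default_rng(0)
times = np.linspace(0.0, 10.0, 12)
x = 2.0 * times / 10.0 - 1.0
# Two synthetic "genes": a linear trend and a quadratic profile, plus noise.
clean = np.vstack([1.0 + 2.0 * x, 0.5 * (3.0 * x**2 - 1.0)])
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

coefs = fit_coefficients(times, noisy, degree=3)
labels = crp_partition(n_items=20, alpha=1.0, rng=rng)
print(coefs.round(2))
print("clusters drawn from the prior:", labels.max() + 1)
```

In this sketch the number of clusters is not fixed in advance: it emerges from the draw, and larger values of the hypothetical concentration parameter `alpha` tend to produce more clusters. In a full model, the rows of `coefs` would be the quantities clustered by the Gaussian mixture.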