

Bayesian curve fitting and clustering with Dirichlet process mixture models for microarray data
Authors: Ju-Hyun Park, Minjung Kyung
Institution: 1. Department of Statistics, Dongguk University, Seoul 04620, South Korea; 2. Department of Statistics, Duksung Women’s University, Seoul 01369, South Korea
Abstract: In molecular biology, microarray data are often analyzed to cluster genes with similar expression profiles and thereby identify genes that are differentially expressed under multiple biological conditions. A notable characteristic of a gene expression profile is that it follows a cyclic curve over time. To group sequences with similar molecular functions, we propose a Bayesian Dirichlet process mixture of linear regression models built on a Fourier series, with a spike-and-slab prior assumed for each regression coefficient. A full Gibbs-sampling algorithm is developed for efficient Markov chain Monte Carlo (MCMC) posterior computation. Because of the so-called “label-switching” problem, and because the number of clusters varies across MCMC iterations, the post-processing approach of Fritsch and Ickstadt (2009) is then applied to the MCMC samples to obtain a single optimal clustering estimate, which maximizes the posterior expected adjusted Rand index computed from the posterior probabilities that two observations are clustered together. The proposed method is illustrated with two simulated data sets and one real data set on the physiological response of fibroblasts to serum from Iyer et al. (1999).
Keywords: Temporal cyclic gene expression profiles; Dirichlet process mixture; Fourier series; Variable selection; Label-switching; Adjusted Rand index
MSC: primary 62G08; secondary 62P10
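
The post-processing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`posterior_similarity`, `pear`, `best_draw`) are hypothetical, the search is restricted to the sampled clusterings, and the criterion is the posterior expected adjusted Rand (PEAR) index as we understand it from Fritsch and Ickstadt (2009). The input is assumed to be an array of cluster labels with one row per MCMC draw.

```python
import numpy as np


def posterior_similarity(label_draws):
    """Estimate P(i and j share a cluster) from MCMC label draws.

    label_draws: integer array of shape (n_draws, n_obs).
    The result does not depend on how clusters are labeled in each draw,
    so label switching does not affect it.
    """
    draws = np.asarray(label_draws)
    # same[s, i, j] is True when observations i and j share a cluster in draw s
    same = draws[:, :, None] == draws[:, None, :]
    return same.mean(axis=0)


def pear(labels, psm):
    """PEAR score of one clustering (assumed form of the criterion)."""
    labels = np.asarray(labels)
    n = labels.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair i < j exactly once
    together = (labels[:, None] == labels[None, :])[upper].astype(float)
    pi = psm[upper]
    n_pairs = n * (n - 1) / 2
    expected = together.sum() * pi.sum() / n_pairs
    numerator = (together * pi).sum() - expected
    denominator = 0.5 * (together.sum() + pi.sum()) - expected
    # Degenerate case (e.g. every draw puts all points in one cluster):
    # returning 0.0 here is a convention of this sketch.
    if abs(denominator) < 1e-12:
        return 0.0
    return numerator / denominator


def best_draw(label_draws):
    """Return the sampled clustering with the highest PEAR score, and the PSM."""
    draws = np.asarray(label_draws)
    psm = posterior_similarity(draws)
    scores = [pear(d, psm) for d in draws]
    return draws[int(np.argmax(scores))], psm
```

Restricting the maximization to the sampled clusterings is only one of several possible search strategies; it keeps the sketch short, but the maximizer over all partitions may not be among the draws.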
This article is indexed in ScienceDirect and other databases.