Exact and Monte Carlo calculations of integrated likelihoods for the latent class model

Authors:	C. Biernacki, G. Celeux, G. Govaert

Institutions:	1. CNRS & Université de Lille 1, Villeneuve d’Ascq, France; 2. INRIA, Orsay, France; 3. CNRS & Université de Technologie de Compiègne, Compiègne, France

Abstract:	The latent class model, or multivariate multinomial mixture, is a powerful approach for clustering categorical data. It assumes conditional independence of the variables given the latent class to which a statistical unit belongs. In this paper, we exploit the fact that a fully Bayesian analysis with Jeffreys non-informative prior distributions involves no technical difficulty to propose an exact expression of the integrated complete-data likelihood, which is known to be a meaningful model selection criterion from a clustering perspective. Similarly, a Monte Carlo approximation of the integrated observed-data likelihood can be obtained in two steps: an exact integration over the parameters is followed by an approximation of the sum over all possible partitions through an importance sampling strategy. The exact and the approximate criteria are then compared experimentally with their respective standard asymptotic BIC approximations for choosing the number of mixture components. Numerical experiments on simulated data and a biological example show that the asymptotic criteria are usually dramatically more conservative than the non-asymptotic criteria presented here, not only for moderate sample sizes, as expected, but also for quite large sample sizes. This research highlights that standard asymptotic criteria could often fail to select interesting structures present in the data.

Keywords:	Categorical data; Bayesian model selection; Jeffreys conjugate prior; Importance sampling; EM algorithm; Gibbs sampler
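
Illustration:	The abstract states that the integrated complete-data likelihood has an exact expression under Jeffreys priors. The Python sketch below shows one way such a closed form can arise. It is not the paper's own code or notation. It assumes independent symmetric Dirichlet(1/2) priors (the Jeffreys prior for a multinomial) on the mixing proportions and on each class-by-variable multinomial, in which case the integral factorizes into Dirichlet-multinomial terms. The function names, the 0-based integer coding of categories, and the `levels` argument are hypothetical choices made for this example.

```python
from math import lgamma


def log_dirichlet_multinomial(counts, alpha=0.5):
    """Log marginal probability of one specific sequence of categorical draws
    with the given category counts, after integrating the multinomial
    parameter against a symmetric Dirichlet(alpha) prior."""
    m = len(counts)
    n = sum(counts)
    return (lgamma(m * alpha) - m * lgamma(alpha)
            + sum(lgamma(c + alpha) for c in counts)
            - lgamma(n + m * alpha))


def log_integrated_complete_likelihood(x, z, g, levels, alpha=0.5):
    """Log of p(x, z) for a latent class model with g classes.

    x      : list of rows, each a list of category codes (0-based ints)
    z      : list of class labels in range(g), one per row
    levels : number of categories of each variable (assumed known)
    alpha  : Dirichlet hyperparameter; 0.5 is the assumed Jeffreys choice
    """
    # Term for the partition: class sizes under the mixing-proportion prior.
    class_sizes = [0] * g
    for label in z:
        class_sizes[label] += 1
    total = log_dirichlet_multinomial(class_sizes, alpha)

    # Conditional independence given the class: one term per (class, variable).
    for k in range(g):
        rows = [row for row, label in zip(x, z) if label == k]
        for j, m_j in enumerate(levels):
            counts = [0] * m_j
            for row in rows:
                counts[row[j]] += 1
            total += log_dirichlet_multinomial(counts, alpha)
    return total
```

As a sanity check on the sketch, a single observation of one binary variable assigned to a single class gives log(1/2), and the exponentiated values sum to 1 over all possible data sets and partitions of a fixed size. The integrated observed-data likelihood mentioned in the abstract would additionally require summing over partitions, which the paper approximates by importance sampling and which this sketch does not attempt.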
This article is indexed in ScienceDirect and other databases.
