Bayesian model averaging for estimating the number of classes: applications to the total number of species in metagenomics |
| |
Authors: | Sébastien Li-Thiao-Té; Jean-Jacques Daudin; Stéphane Robin |
| |
Affiliation: | 1. UMR 7539 Institut Galilée/Université Paris 13, Villetaneuse, France; 2. UMR 518 AgroParisTech/INRA, AgroParisTech, Paris, France |
| |
Abstract: | The species abundance distribution and the total number of species are fundamental descriptors of the biodiversity of an ecological community. This paper focuses on situations where large numbers of rare species are not observed in the data set due to insufficient sampling of the community, as is the case in metagenomics for the study of microbial diversity. We use a truncated mixture model for the observations to explicitly tackle the missing data and propose methods to estimate the total number of species and, in particular, a Bayesian credibility interval for this number. We focus on computationally efficient procedures based on variational methods and importance sampling, as opposed to Markov chain Monte Carlo sampling, and we use Bayesian model averaging because the number of components of the mixture model is unknown. |
| |
Keywords: | mixture models; Bayesian model averaging; variational methods; truncation; metagenomics |