Model selection for mixture‐based clustering for ordinal data |
| |
Authors: | D Fernández; R Arnold |
| |
Institution: | 1. Department of Epidemiology & Biostatistics, School of Public Health, University at Albany: State University of New York, Rensselaer, NY, USA; 2. School of Mathematics and Statistics, Victoria University of Wellington, Wellington, New Zealand |
| |
Abstract: | One of the key questions in the use of mixture models concerns the choice of the number of components most suitable for a given data set. In this paper we investigate answers to this problem in the context of likelihood‐based clustering of the rows of a matrix of ordinal data modelled by the ordered stereotype model. Two methodologies for selecting the best model are demonstrated and compared. The first approach fits a separate model to the data for each possible number of clusters, and then uses an information criterion to select the best model. The second approach uses a Bayesian construction in which the parameters and the number of clusters are estimated simultaneously from their joint posterior distribution. Simulation studies are presented which include a variety of scenarios in order to test the reliability of both approaches. Finally, the results of the application of model selection to two real data sets are shown. |
| |
Keywords: | finite mixture model; information criteria; RJMCMC sampler; stereotype model |
|
|
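
The first approach in the abstract (fit one model per candidate number of clusters, then compare them with an information criterion) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the log-likelihoods, the parameter counts, and the choice of sample size are all hypothetical, and the fitting of the ordered stereotype mixture itself is assumed to have been done elsewhere. AIC and BIC are used here only as two familiar examples of information criteria.

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, BIC) for one fitted model; lower values are better."""
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * math.log(n_obs)
    return aic, bic


def select_num_clusters(fits, n_obs, criterion="bic"):
    """Pick the number of clusters with the smallest criterion value.

    fits maps each candidate number of clusters to a tuple
    (maximised log-likelihood, number of free parameters).
    """
    index = {"aic": 0, "bic": 1}[criterion]
    scores = {
        k: information_criteria(log_lik, n_params, n_obs)[index]
        for k, (log_lik, n_params) in fits.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fitted values, invented purely for illustration.
fits = {1: (-1250.0, 8), 2: (-1180.0, 13), 3: (-1174.0, 18)}

# Assumption: n_obs is the sample size used in the BIC penalty; what
# counts as the sample size when clustering rows of a matrix is a
# modelling decision that this sketch does not settle.
best_bic, bic_scores = select_num_clusters(fits, n_obs=200, criterion="bic")
best_aic, aic_scores = select_num_clusters(fits, n_obs=200, criterion="aic")
```

With these made-up numbers the two criteria disagree about the best number of clusters, because BIC penalises the extra parameters more heavily than AIC once the sample size is moderately large. Disagreements of this kind are one reason to compare criteria by simulation.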