Clustering for contingency tables: boxes and partitions |
| |
Authors: | Boris Mirkin |
| |
Affiliation: | (1) DIMACS, Rutgers University, P.O. Box 1179, 08855 Piscataway, NJ, USA; (2) Central Economics-Mathematics Institute, Moscow, Russia |
| |
Abstract: | The correspondence analysis (CA) method appears to be an effective tool for analysis of interrelations between rows and columns in two-way contingency data. A discrete version of the method, box clustering, is developed in the paper using an approximation version of the CA model extended to the case when CA factor values are required to be Boolean. Several properties of the proposed SEFIT-BOX algorithm are proved to facilitate interpretation of its output. It is also shown that two known partitioning algorithms (applied within row or column sets only) could be considered as locally optimal algorithms for fitting the model, and extensions of these algorithms to a simultaneous row and column partitioning problem are proposed. |
| |
Keywords: | Contingency data; correspondence analysis; clustering; box clustering; sequential fitting |
This article is indexed in SpringerLink and other databases. |
|