Block diagrams and splitting criteria for classification trees |
| |
Authors: | Paul C. Taylor (1), Bernard W. Silverman (2) |
| |
Affiliation: | (1) Department of Applied Statistics, University of Reading, Earley Gate 3, Whiteknights Road, P.O. Box 240, RG6 2FN Reading, UK; (2) School of Mathematics, University of Bristol, University Walk, BS8 1TW Bristol, UK |
| |
Abstract: | Various aspects of the classification tree methodology of Breiman et al. (1984) are discussed. A method of displaying classification trees, called block diagrams, is developed. Block diagrams give a clear presentation of the classification and are useful both for pointing out features of the data set under consideration and for highlighting deficiencies in the classification method being used. Various splitting criteria are discussed; the usual Gini-Simpson criterion presents difficulties when there is a relatively large number of classes, and improved splitting criteria are obtained. One particular improvement is the introduction of adaptive anti-end-cut factors that take advantage of highly asymmetrical splits where appropriate. These factors use the number and mix of classes in the current node of the tree to identify whether it is likely to be advantageous to create a very small offspring node. A number of data sets are used as examples. |
| |
Keywords: | Classification tree methodology; Gini-Simpson criterion; anti-end-cut factors; mean posterior improvement criterion |
This article is indexed in SpringerLink and other databases. |
|
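For orientation, the Gini-Simpson criterion named in the abstract can be sketched as follows. This is a minimal illustration of the standard impurity measure, 1 minus the sum of squared class proportions, and of the impurity decrease used to score a binary split. It is not the authors' code, and it does not reproduce the adaptive anti-end-cut factors or the mean posterior improvement criterion, whose definitions are not given in this record. The function names and the toy labels are assumptions made for the example.

```python
from collections import Counter


def gini_simpson(labels):
    """Gini-Simpson impurity of a node: 1 - sum_j p_j^2 over class proportions p_j."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Impurity decrease from splitting `parent` into `left` and `right` offspring."""
    n = len(parent)
    return (
        gini_simpson(parent)
        - len(left) / n * gini_simpson(left)
        - len(right) / n * gini_simpson(right)
    )


# Toy node (hypothetical data): two classes, four cases each.
parent = ["a"] * 4 + ["b"] * 4

# A balanced split that separates the classes perfectly.
balanced = impurity_decrease(parent, ["a"] * 4, ["b"] * 4)

# A highly asymmetrical ("end-cut") split that peels off a single case.
end_cut = impurity_decrease(parent, ["a"], ["a"] * 3 + ["b"] * 4)

print(balanced, end_cut)
```

In this toy node the end-cut split scores far lower than the balanced one, even though its small offspring node is pure. The abstract's adaptive factors concern deciding when such very small offspring nodes are nonetheless worth creating.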