A fast splitting procedure for classification trees |
| |
Authors: | Francesco Mola, Roberta Siciliano |
| |
Affiliation: | (1) Dipartimento di Matematica e Statistica, Università degli Studi di Napoli Federico II, Monte S. Angelo, via Cintia, 80126 Napoli, Italy |
| |
Abstract: | This paper provides a faster method to find the best split at each node when using the CART methodology. The predictability index τ is proposed as a splitting rule for growing the same classification tree as CART does when using the Gini index of heterogeneity as an impurity measure. A theorem is introduced to show a new property of the index τ: the τ for a given predictor has a value not lower than the τ for any split generated by that predictor. This property is used to make a substantial saving in the time required to generate a classification tree. Three simulation studies are presented to show the computational gain in terms of both the number of splits analysed at each node and the CPU time. The proposed splitting algorithm can also be computationally efficient on real data sets, as shown in an example. |
| |
Keywords: | CART; predictability index τ; Gini index of heterogeneity; maximal binary tree |
This article is indexed in SpringerLink and other databases. |
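
The bound stated in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that τ is the Goodman-Kruskal predictability index computed from a predictor-by-response contingency table, and that the time saving comes from visiting predictors in decreasing order of their own τ and stopping once the best split found so far is at least as good as the τ of the next predictor. The function names (`gk_tau`, `binary_splits`, `best_split`) and the example counts are hypothetical.

```python
from itertools import combinations


def gk_tau(table):
    """Goodman-Kruskal tau of the response (columns) given a predictor (rows).

    table[i][j] = number of cases with predictor category i and response
    class j. Assumes the response has at least two non-empty classes.
    """
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # sum_j p_.j^2, so that 1 - marginal is the Gini heterogeneity of the node
    marginal = sum((c / n) ** 2 for c in col_totals)
    # sum_i sum_j p_ij^2 / p_i. (empty predictor categories are skipped)
    conditional = sum(
        sum((c / n) ** 2 for c in row) / (sum(row) / n)
        for row in table
        if sum(row) > 0
    )
    # The numerator equals the decrease in Gini impurity produced by the rows.
    return (conditional - marginal) / (1.0 - marginal)


def binary_splits(table):
    """Yield (left_categories, two-row table) for every binary split of the rows."""
    k = len(table)
    # Keeping category 0 on the left avoids generating each split twice.
    for size in range(0, k - 1):
        for extra in combinations(range(1, k), size):
            left = (0,) + extra
            merged = [[0] * len(table[0]) for _ in range(2)]
            for i, row in enumerate(table):
                side = 0 if i in left else 1
                merged[side] = [a + b for a, b in zip(merged[side], row)]
            yield left, merged


def best_split(tables):
    """tables: dict mapping a predictor name to its contingency table."""
    # Assumed search order: predictors sorted by their own tau (the upper bound).
    ranked = sorted(tables, key=lambda name: gk_tau(tables[name]), reverse=True)
    best = (None, None, -1.0)
    examined = 0
    for name in ranked:
        if gk_tau(tables[name]) <= best[2]:
            break  # by the bound, no remaining predictor can beat the best split
        for left, merged in binary_splits(tables[name]):
            examined += 1
            t = gk_tau(merged)
            if t > best[2]:
                best = (name, left, t)
    return best, examined


# Hypothetical counts for two three-category predictors and a two-class response.
tables = {
    "x1": [[30, 5], [10, 20], [5, 30]],
    "x2": [[20, 18], [12, 22], [13, 15]],
}
best, examined = best_split(tables)
print(best, examined)
```

In this example the τ of `x2` is lower than the τ of the best binary split of `x1`, so none of the splits of `x2` needs to be evaluated. An exhaustive search would examine all six candidate splits.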