Three Steps Strategy to Search for Optimum Classification Trees |
| |
Authors: | Muhammad Azam, Muhammad Aslam, Karl Peter Pfeiffer |
| |
Affiliation: | 1. Department of Statistics, Forman Christian College University, Lahore, Pakistan; 2. Department of Statistics, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia; 3. Department of Medical Statistics, Informatics and Health Economics, Innsbruck Medical University, Austria |
| |
Abstract: | This article presents a new strategy for constructing classification trees. The proposed scheme keeps a record of the sequence of each constructed classification tree in an array, in terms of both the splitting predictors and their splitting values, so that there are as many arrays as drawn samples. A three-step strategy is then introduced to search for the optimum classification tree. For many real-life datasets, the proposed strategy gives generalized error rates that are comparable to or better than those of tree and rpart (packages available for classification in R), using four well-known evaluation functions to split nodes: the Gini, the Entropy, the Twoing, and the Exponent-based function. |
| |
Keywords: | Evaluation functions; Optimum classification trees; Three steps strategy; Value-wise tree sequence; Variable-wise tree sequence |
|
|
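The abstract names four evaluation functions used to split nodes. As a rough illustration of three of them, the sketch below scores one candidate split using the textbook definitions of the Gini index, the entropy, and the twoing criterion. It is not the authors' implementation: the function names are hypothetical, both child nodes are assumed to be non-empty, and the Exponent-based function is left out because the abstract does not define it.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return {class: relative frequency} for a list of class labels."""
    n = len(labels)
    return {c: k / n for c, k in Counter(labels).items()}


def gini(labels):
    """Gini index: 1 - sum_j p_j^2."""
    return 1.0 - sum(p * p for p in class_proportions(labels).values())


def entropy(labels):
    """Entropy in bits: -sum_j p_j * log2(p_j)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels).values())


def impurity_decrease(left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    parent = left + right
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


def twoing(left, right):
    """Twoing criterion: (p_L * p_R / 4) * (sum_j |p(j|L) - p(j|R)|)^2."""
    n = len(left) + len(right)
    p_left, p_right = len(left) / n, len(right) / n
    prop_left, prop_right = class_proportions(left), class_proportions(right)
    classes = set(prop_left) | set(prop_right)
    diff = sum(abs(prop_left.get(c, 0.0) - prop_right.get(c, 0.0)) for c in classes)
    return p_left * p_right / 4.0 * diff ** 2


if __name__ == "__main__":
    # Hypothetical candidate split of six observations into two child nodes.
    left = ["a", "a", "a", "b"]
    right = ["b", "b"]
    print("Gini decrease:   ", impurity_decrease(left, right, gini))
    print("Entropy decrease:", impurity_decrease(left, right, entropy))
    print("Twoing value:    ", twoing(left, right))
```

For all three scores a larger value indicates a better split, so a tree-growing routine would evaluate every candidate predictor and splitting value and keep the one with the highest score.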