Publications
Open Access Issue
Taiga: Performance Optimization of the C4.5 Decision Tree Construction Algorithm
Tsinghua Science and Technology 2016, 21 (4): 415-425
Published: 11 August 2016

Classification is an important machine learning problem, and decision tree construction algorithms are an important class of solutions to it. RainForest is a scalable way to implement decision tree construction algorithms. It consists of several algorithms, the best of which is a hybrid of a traditional recursive implementation and an iterative implementation that uses more memory but involves fewer write operations. We propose an optimized algorithm inspired by RainForest. By using a more sophisticated criterion for switching between the two algorithms, we obtain a performance gain even when all statistical information fits in memory. Evaluations show that our method achieves an average speedup of 2.8 times over the traditional recursive implementation.
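
The abstract does not spell out the algorithm or its switching criterion, so the sketch below does not try to reproduce either. It only illustrates, under stated assumptions, the kind of "statistical information" a RainForest-style builder keeps per tree node: a table of class counts per attribute value (an AVC-set in RainForest's terminology), from which the C4.5 gain ratio of a categorical split can be computed without revisiting the data. The function names `entropy`, `avc_set`, and `gain_ratio` and the dictionary-based row format are hypothetical choices made for this illustration, not taken from the paper.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def avc_set(rows, attr, label):
    """Single pass over the rows: count class labels for each value of attr.

    Assumption for this sketch: rows are dicts and attr is categorical.
    """
    table = defaultdict(Counter)
    for row in rows:
        table[row[attr]][row[label]] += 1
    return table


def gain_ratio(table):
    """C4.5 gain ratio of a categorical split, computed from counts only."""
    class_totals = Counter()
    for counts in table.values():
        class_totals.update(counts)
    n = sum(class_totals.values())
    if n == 0:
        return 0.0
    # Size of each partition produced by splitting on the attribute.
    sizes = [sum(counts.values()) for counts in table.values()]
    # Weighted entropy of the class labels after the split.
    remainder = sum(
        size / n * entropy(list(counts.values()))
        for size, counts in zip(sizes, table.values())
    )
    gain = entropy(list(class_totals.values())) - remainder
    split_info = entropy(sizes)
    return gain / split_info if split_info > 0 else 0.0
```

Because `gain_ratio` reads only the count table, the training rows need to be scanned once per node and attribute to fill the table. That is roughly why it matters whether this information fits in memory, although how the proposed method exploits it is not described in the abstract.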
