Abstract:State-of-the-art machine learning solutions mainly focus on creating highly accurate models without constraints on hardware resources. Stream mining algorithms are designed to run on resource-constrained devices, thus a focus on low power and energy and memory-efficient is essential. The Hoeffding tree algorithm is able to create energy-efficient models, but at the cost of less accurate trees in comparison to their ensembles counterpart. Ensembles of Hoeffding trees, on the other hand, create a highly accurate forest of trees but consume five times more energy on average. An extension that tried to obtain similar results to ensembles of Hoeffding trees was the Extremely Fast Decision Tree (EFDT). This paper presents the Green Accelerated Hoeffding Tree (GAHT) algorithm, an extension of the EFDT algorithm with a lower energy and memory footprint and the same (or higher for some datasets) accuracy levels. GAHT grows the tree setting individual splitting criteria for each node, based on the distribution of the number of instances over each particular leaf. The results show that GAHT is able to achieve the same competitive accuracy results compared to EFDT and ensembles of Hoeffding trees while reducing the energy consumption up to 70%.
Abstract:Machine learning software accounts for a significant amount of energy consumed in data centers. These algorithms are usually optimized towards predictive performance, i.e. accuracy, and scalability. This is the case of data stream mining algorithms. Although these algorithms are adaptive to the incoming data, they have fixed parameters from the beginning of the execution. We have observed that having fixed parameters lead to unnecessary computations, thus making the algorithm energy inefficient. In this paper we present the nmin adaptation method for Hoeffding trees. This method adapts the value of the nmin parameter, which significantly affects the energy consumption of the algorithm. The method reduces unnecessary computations and memory accesses, thus reducing the energy, while the accuracy is only marginally affected. We experimentally compared VFDT (Very Fast Decision Tree, the first Hoeffding tree algorithm) and CVFDT (Concept-adapting VFDT) with the VFDT-nmin (VFDT with nmin adaptation). The results show that VFDT-nmin consumes up to 27% less energy than the standard VFDT, and up to 92% less energy than CVFDT, trading off a few percent of accuracy in a few datasets.