Heitor Gomes

Balancing Performance and Energy Consumption of Bagging Ensembles for the Classification of Data Streams in Edge Computing

Jan 17, 2022

Guilherme Cassales, Heitor Gomes, Albert Bifet, Bernhard Pfahringer, Hermes Senger

Abstract: In recent years, the Edge Computing (EC) paradigm has emerged as an enabling factor for developing technologies like the Internet of Things (IoT) and 5G networks, bridging the gap between Cloud Computing services and end-users and supporting low latency, mobility, and location awareness for delay-sensitive applications. Most solutions in EC employ machine learning (ML) methods to perform data classification and other information processing tasks on continuous and evolving data streams. Usually, such solutions have to cope with vast amounts of data that arrive as data streams while balancing energy consumption, latency, and the predictive performance of the algorithms. Ensemble methods achieve remarkable predictive performance when applied to evolving data streams due to the combination of several models and the possibility of selective resets. This work investigates strategies for optimizing the performance (i.e., delay, throughput) and energy consumption of bagging ensembles to classify data streams. The experimental evaluation involved six state-of-the-art ensemble algorithms (OzaBag, OzaBag Adaptive Size Hoeffding Tree, Online Bagging ADWIN, Leveraging Bagging, Adaptive Random Forest, and Streaming Random Patches) applied to five widely used machine learning benchmark datasets with varied characteristics on three computer platforms. Such strategies can significantly reduce energy consumption in 96% of the experimental scenarios evaluated. Although there are trade-offs, they can be balanced to avoid a significant loss in predictive performance.

* 18 pages. arXiv admin note: text overlap with arXiv:2112.09834

Improving the performance of bagging ensembles for data streams through mini-batching

Dec 18, 2021

Guilherme Cassales, Heitor Gomes, Albert Bifet, Bernhard Pfahringer, Hermes Senger

Abstract: Often, machine learning applications have to cope with dynamic environments where data are collected in the form of continuous data streams with potentially infinite length and transient behavior. Compared to traditional (batch) data mining, stream processing algorithms have additional requirements regarding computational resources and adaptability to data evolution. They must process instances incrementally because the data's continuous flow prohibits storing data for multiple passes. Ensemble learning has achieved remarkable predictive performance in this scenario. Implemented as a set of (several) individual classifiers, ensembles are naturally amenable to task parallelism. However, the incremental learning and dynamic data structures used to capture concept drift increase the number of cache misses and hinder the benefit of parallelism. This paper proposes a mini-batching strategy that can improve memory access locality and performance of several ensemble algorithms for stream mining in multi-core environments. With the aid of a formal framework, we demonstrate that mini-batching can significantly decrease the reuse distance (and the number of cache misses). Experiments on six different state-of-the-art ensemble algorithms applied to four benchmark datasets with varied characteristics show speedups of up to 5x on 8-core processors. These benefits come at the expense of a small reduction in predictive performance.

* Information Sciences, Volume 580, 2021, Pages 260-282
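
The mini-batching idea described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the class names `MajorityClass` and `MiniBatchBagging`, the `batch_size` parameter, and the toy base learner (a stand-in for a Hoeffding tree) are all assumptions made for illustration. The sketch uses standard online bagging, where each learner sees each instance a Poisson(1) number of times, and buffers instances so that each learner processes a whole batch before the next learner is touched.

```python
import math
import random
from collections import Counter


class MajorityClass:
    """Toy incremental base learner (hypothetical stand-in for a Hoeffding tree)."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y, weight=1):
        self.counts[y] += weight

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


def poisson1(rng):
    """Draw from Poisson(lambda=1) with Knuth's multiplication method."""
    limit, k, p = math.exp(-1.0), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


class MiniBatchBagging:
    """Online bagging that trains on buffered mini-batches (illustrative only)."""

    def __init__(self, n_learners=4, batch_size=8, seed=0):
        self.learners = [MajorityClass() for _ in range(n_learners)]
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.buffer = []

    def learn_one(self, x, y):
        # Instances are only buffered until a full mini-batch is available.
        self.buffer.append((x, y))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # Learner-major loop: one learner consumes the whole batch before the
        # next one starts, so its model is reused across consecutive instances.
        for learner in self.learners:
            for x, y in self.buffer:
                k = poisson1(self.rng)  # online-bagging resampling weight
                if k > 0:
                    learner.learn_one(x, y, weight=k)
        self.buffer.clear()

    def predict_one(self, x):
        votes = Counter(learner.predict_one(x) for learner in self.learners)
        votes.pop(None, None)  # ignore learners that have seen no data yet
        return votes.most_common(1)[0][0] if votes else None
```

In this sketch the learners in `flush` are independent of one another, so the outer loop is the natural place to hand each learner to its own thread. Predictions made while the buffer is filling use a model that has not yet seen the buffered instances, which is one plausible source of the small accuracy loss the abstract mentions.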
