
Title: Ensemble of Distributed Learners for Online Classification of Dynamic Data Streams

Aug 24, 2013

Luca Canzian, Yu Zhang, Mihaela van der Schaar



Abstract: We present an efficient distributed online learning scheme to classify data captured from distributed, heterogeneous, and dynamic data sources. Our scheme consists of multiple distributed local learners that analyze different streams of data correlated with a common event that needs to be classified. Each learner uses a local classifier to make a local prediction. The local predictions are then collected by each learner and combined using a weighted majority rule to output the final prediction. We propose a novel online ensemble learning algorithm that updates the aggregation rule in order to adapt to the underlying data dynamics. We rigorously determine a bound on the worst-case misclassification probability of our algorithm, which depends on the misclassification probabilities of the best static aggregation rule and of the best local classifier. Importantly, the worst-case misclassification probability of our algorithm tends asymptotically to 0 if the misclassification probability of the best static aggregation rule or that of the best local classifier tends to 0. We then extend our algorithm to address challenges specific to the distributed implementation, and we prove new bounds that apply to these settings. Finally, we test our scheme in an evaluation study on several data sets. When applied to data sets widely used in the literature on dynamic data streams and concept drift, our scheme exhibits performance gains ranging from 34% to 71% with respect to state-of-the-art solutions.

* 14 pages, 5 figures, 2 tables
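
To make the aggregation step described in the abstract concrete, here is a minimal Python sketch of a weighted majority vote over local predictions with an online weight update. It is an illustration under stated assumptions, not the authors' algorithm: labels are assumed to lie in {-1, +1}, the weights are assumed to start at 1, and the mistake-driven, perceptron-style update is a stand-in chosen for simplicity, since the abstract does not specify the update rule. The function names `weighted_majority`, `update_weights`, and `run_stream` are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def weighted_majority(predictions: Sequence[int], weights: Sequence[float]) -> int:
    """Combine local predictions in {-1, +1} with a weighted vote.

    Ties (a score of exactly 0) are resolved as +1, an arbitrary choice.
    """
    score = sum(w * p for w, p in zip(weights, predictions))
    return 1 if score >= 0 else -1


def update_weights(
    weights: Sequence[float], predictions: Sequence[int], label: int
) -> List[float]:
    """Perceptron-style update (an assumption, not taken from the abstract).

    Weights change only when the ensemble prediction is wrong: learners that
    agreed with the true label gain weight, the others lose weight.
    """
    if weighted_majority(predictions, weights) == label:
        return list(weights)
    return [w + label * p for w, p in zip(weights, predictions)]


def run_stream(
    stream: Iterable[Tuple[Sequence[int], int]], n_learners: int
) -> Tuple[List[float], int]:
    """Process (local predictions, true label) pairs one at a time."""
    weights = [1.0] * n_learners  # hypothetical initialization
    mistakes = 0
    for predictions, label in stream:
        if weighted_majority(predictions, weights) != label:
            mistakes += 1
        weights = update_weights(weights, predictions, label)
    return weights, mistakes
```

For example, with three learners where only the first one is ever correct, `run_stream([([1, -1, -1], 1), ([-1, 1, 1], -1)], 3)` makes one mistake on the first instance, after which the vote follows the reliable learner.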
