Abstract: We study the effectiveness of replacing the split strategy of the state-of-the-art online tree learner, Hoeffding Tree, with a rigorous but more eager splitting strategy. Our method, Hoeffding AnyTime Tree (HATT), uses the Hoeffding Test to determine whether the current best candidate split is superior to the current split, with the possibility of later revision, whereas Hoeffding Tree aims to determine whether the top candidate is better than the second best and then fixes that choice permanently. Our method converges to the ideal batch tree while Hoeffding Tree does not. Decision tree ensembles are widely used in practice, and in this work we study the efficacy of HATT as a base learner for online bagging and online boosting ensembles. On UCI and synthetic streams, Hoeffding AnyTime Tree achieves higher prequential accuracy than Hoeffding Tree. As a base learner, HATT outperforms HT at the 0.05 significance level for the majority of tested ensembles on what we believe is the largest and most comprehensive set of test benches in the online learning literature. Our results indicate that HATT is a superior alternative to Hoeffding Tree in a large number of ensemble settings.
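For reference, the split rules contrasted above can be written in terms of the Hoeffding bound; the notation below (average gain $\overline{G}$, range $R$, confidence parameter $\delta$) is a sketch in the usual presentation rather than a quotation from either paper.

$$\epsilon = \sqrt{\frac{R^{2}\,\ln(1/\delta)}{2n}}$$

After $n$ observations at a node, Hoeffding Tree splits on the best candidate attribute $X_a$ once $\overline{G}(X_a) - \overline{G}(X_b) > \epsilon$, where $X_b$ is the second-best candidate, and never revisits that choice; Hoeffding AnyTime Tree instead tests $\overline{G}(X_a) - \overline{G}(X_{\mathrm{current}}) > \epsilon$ against the split currently in place (or the null split, at a leaf that has not yet split) and replaces it whenever the test passes.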
Abstract: Hoeffding trees are the state-of-the-art methods in decision tree learning for evolving data streams. Because of their efficiency, these very fast decision trees are used in many real applications where data is created in real time. In this work, we draw out explanations for why these streaming decision tree algorithms for stationary and nonstationary streams (HoeffdingTree and HoeffdingAdaptiveTree) work as well as they do. In doing so, we identify thirteen distinct, unspecified design decisions in both the theoretical constructs and their implementations that have substantial and consequential effects on predictive accuracy: design decisions that, without necessarily changing the essence of the algorithms, drive algorithm performance. We begin a larger conversation about explainability not just of the model but also of the processes responsible for an algorithm's success.
Abstract: We introduce a novel incremental decision tree learning algorithm, Hoeffding Anytime Tree, that is statistically more efficient than the current state-of-the-art, Hoeffding Tree. We demonstrate that an implementation of Hoeffding Anytime Tree ("Extremely Fast Decision Tree", a minor modification to the MOA implementation of Hoeffding Tree) obtains significantly superior prequential accuracy on most of the largest classification datasets from the UCI repository. Hoeffding Anytime Tree produces the asymptotic batch tree in the limit, is naturally resilient to concept drift, and can be used as a higher-accuracy replacement for Hoeffding Tree in most scenarios, at a small additional computational cost.
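As a usage illustration only (not the experimental setup of the paper), a prequential, test-then-train comparison of the two learners might look like the sketch below; it assumes the River library's HoeffdingTreeClassifier and ExtremelyFastDecisionTreeClassifier (River's EFDT implementation), and the dataset is an arbitrary small stream chosen for brevity.

```python
# Prequential (interleaved test-then-train) evaluation sketch.
# Assumes the River library; the dataset and default hyperparameters are illustrative.
from river import datasets, metrics, tree

models = {
    "HT": tree.HoeffdingTreeClassifier(),
    "EFDT": tree.ExtremelyFastDecisionTreeClassifier(),
}
accuracy = {name: metrics.Accuracy() for name in models}

for x, y in datasets.Phishing():
    for name, model in models.items():
        y_pred = model.predict_one(x)      # test on the example first ...
        accuracy[name].update(y, y_pred)
        model.learn_one(x, y)              # ... then train on the same example

for name, acc in accuracy.items():
    print(name, acc)
```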