Srikanta Tirthapura

DriftSurf: A Risk-competitive Learning Algorithm under Concept Drift

Mar 13, 2020

Ashraf Tahmasbi, Ellango Jothimurugesan, Srikanta Tirthapura, Phillip B. Gibbons

Abstract: When learning from streaming data, a change in the data distribution, also known as concept drift, can render a previously learned model inaccurate and require training a new model. We present an adaptive learning algorithm that extends previous drift-detection-based methods by incorporating drift detection into a broader stable-state/reactive-state process. The advantage of our approach is that we can use aggressive drift detection in the stable state to achieve a high detection rate, while mitigating the false positive rate of standalone drift detection via a reactive state that reacts quickly to true drifts and eliminates most false positives. The algorithm is generic in its base learner and can be applied across a variety of supervised learning problems. Our theoretical analysis shows that the risk of the algorithm is competitive with that of an algorithm with oracle knowledge of when (abrupt) drifts occur. Experiments on synthetic and real datasets with concept drifts confirm our theoretical analysis.

* 25 pages, 11 figures. Submitted to International Conference on Machine Learning (ICML) 2020
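
The stable-state/reactive-state idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the class names (`MeanModel`, `TwoStateLearner`), the loss-jump detection rule, the `threshold` and `reactive_len` parameters, and the toy base learner are all assumptions made for illustration. It only shows the general pattern of an aggressive detector whose alarms are confirmed or discarded by a short trial period in which a freshly trained model competes with the old one.

```python
class MeanModel:
    """Toy base learner (hypothetical): predicts the running mean of all values seen."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, batch):
        self.total += sum(batch)
        self.count += len(batch)

    def loss(self, batch):
        # Mean squared error of the current prediction on this batch.
        pred = self.total / self.count if self.count else 0.0
        return sum((x - pred) ** 2 for x in batch) / len(batch)


class TwoStateLearner:
    """Illustrative stable/reactive wrapper around any model with update() and loss()."""

    def __init__(self, make_model, threshold=0.5, reactive_len=3):
        self.make_model = make_model
        self.threshold = threshold        # assumed: loss jump that triggers an alarm
        self.reactive_len = reactive_len  # assumed: batches spent in the reactive state
        self.model = make_model()
        self.candidate = None
        self.state = "stable"
        self.best_loss = float("inf")
        self.steps_left = 0
        self.old_total = 0.0
        self.new_total = 0.0

    def step(self, batch):
        loss = self.model.loss(batch)  # test-then-train: evaluate before updating
        if self.state == "stable":
            if loss > self.best_loss + self.threshold:
                # Aggressive detector fired: start a fresh model and enter the reactive state.
                self.state = "reactive"
                self.candidate = self.make_model()
                self.steps_left = self.reactive_len
                self.old_total = self.new_total = 0.0
            else:
                self.best_loss = min(self.best_loss, loss)
        else:
            # Reactive state: compare the old and the fresh model on incoming batches.
            self.old_total += loss
            self.new_total += self.candidate.loss(batch)
            self.steps_left -= 1
            if self.steps_left == 0:
                if self.new_total < self.old_total:
                    self.model = self.candidate  # drift confirmed: switch models
                # Otherwise the alarm was a false positive and the old model is kept.
                self.candidate = None
                self.state = "stable"
                self.best_loss = float("inf")
        self.model.update(batch)
        if self.candidate is not None:
            self.candidate.update(batch)
        return loss


def run(stream):
    """Feed a list of batches to a new learner and return it with the per-batch losses."""
    learner = TwoStateLearner(MeanModel)
    losses = [learner.step(batch) for batch in stream]
    return learner, losses
```

In this sketch a single outlier batch triggers the reactive state but does not replace the model, because the fresh model loses the comparison once the stream returns to its earlier distribution.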

Learning Graphical Models from a Distributed Stream

Oct 05, 2017

Yu Zhang, Srikanta Tirthapura, Graham Cormode

Abstract: A current challenge for data management systems is to support the construction and maintenance of machine learning models over data that is large, multi-dimensional, and evolving. While systems that could support these tasks are emerging, the need to scale to distributed, streaming data requires new models and algorithms. In this setting, in addition to computational scalability and model accuracy, we must also minimize the amount of communication between distributed processors, which is the chief component of latency. We study Bayesian networks, the workhorse of graphical models, and present a communication-efficient method for continuously learning and maintaining a Bayesian network model over data that arrives as a distributed stream partitioned across multiple processors. We show a strategy for maintaining model parameters that leads to an exponential reduction in communication compared with baseline approaches that maintain the exact maximum likelihood estimate (MLE). At the same time, our strategy yields prediction errors comparable to those of the exact MLE for the target distribution and for classification tasks.
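
To make the communication trade-off concrete, here is a minimal Python sketch of one way to estimate a conditional probability from approximately maintained distributed counts. It is an illustration under stated assumptions, not the paper's method: it uses a simple deterministic rule in which each site reports its local count to the coordinator only after the count has grown by a factor of (1 + eps) since the last report, and the names `Site`, `ApproxCounter`, and `cpd_estimate` are hypothetical. Under this rule each site sends a number of messages that grows logarithmically with its local count, rather than one message per event.

```python
class Site:
    """One stream processor holding a local count (hypothetical helper)."""

    def __init__(self, eps):
        self.eps = eps
        self.count = 0      # exact local count
        self.reported = 0   # last value sent to the coordinator
        self.messages = 0   # number of messages sent so far

    def increment(self):
        self.count += 1
        # Report only when the local count exceeds (1 + eps) times the last report.
        if self.count > (1 + self.eps) * self.reported:
            self.reported = self.count
            self.messages += 1


class ApproxCounter:
    """Coordinator-side view of a count spread over several sites."""

    def __init__(self, num_sites, eps):
        self.sites = [Site(eps) for _ in range(num_sites)]

    def increment(self, site_id):
        self.sites[site_id].increment()

    def estimate(self):
        # Sum of last reported values: never above the exact count,
        # and never below exact / (1 + eps).
        return sum(s.reported for s in self.sites)

    def exact(self):
        return sum(s.count for s in self.sites)

    def messages(self):
        return sum(s.messages for s in self.sites)


def cpd_estimate(joint, parent):
    """Approximate MLE of P(X = x | parents = u) as count(x, u) / count(u)."""
    return joint.estimate() / parent.estimate()
```

Because both the numerator and the denominator are within a factor of (1 + eps) of their exact values, the estimated conditional probability is within the same multiplicative factor of the exact MLE ratio in this sketch.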
