Bartolomeo Vacchetti

Unsupervised Concept Drift Detection from Deep Learning Representations in Real-time

Jun 24, 2024

Salvatore Greco, Bartolomeo Vacchetti, Daniele Apiletti, Tania Cerquitelli

Abstract: Concept drift is a phenomenon in which the underlying data distribution and statistical properties of a target domain change over time, degrading the model's performance. Consequently, models deployed in production require continuous monitoring through drift detection techniques. Most drift detection methods to date are supervised, i.e., they rely on ground-truth labels. However, true labels are unavailable in many real-world scenarios. Although recent efforts have produced unsupervised methods, these often lack the required accuracy, are too complex to run in real time in production environments, or cannot effectively characterize drift. To address these challenges, we propose DriftLens, an unsupervised real-time concept drift detection framework. It works on unstructured data by exploiting the distribution distances of deep learning representations. DriftLens can also characterize drift by analyzing each label separately. A comprehensive experimental evaluation is presented with multiple deep learning classifiers for text, image, and speech. Results show that (i) DriftLens outperforms previous methods at detecting drift in $11/13$ use cases; (ii) it runs at least 5 times faster; (iii) its detected drift value correlates strongly with the amount of drift (correlation $\geq 0.85$); and (iv) it is robust to parameter changes.
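
The idea of measuring drift as a distribution distance between deep learning representations, overall and per label, can be illustrated with a minimal sketch. The abstract does not say which distance DriftLens uses, so the Fréchet distance between Gaussians fitted to the embeddings is an assumption here. The function names `frechet_distance` and `per_label_distances`, the synthetic embeddings, and the use of predicted labels for the per-label split are also hypothetical and are not taken from the paper.

```python
import numpy as np


def frechet_distance(ref: np.ndarray, new: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two embedding sets.

    ASSUMPTION: this distance is chosen for illustration only; the
    abstract does not specify the distance used by DriftLens.
    Each input has shape (n_samples, n_features) with n_samples >= 2.
    """
    mu_r, mu_n = ref.mean(axis=0), new.mean(axis=0)
    cov_r = np.cov(ref, rowvar=False)
    cov_n = np.cov(new, rowvar=False)
    # Tr(sqrt(cov_r @ cov_n)) equals the sum of square roots of the
    # eigenvalues of the product, which are real and non-negative for
    # positive semi-definite inputs (clip guards against round-off).
    eig = np.linalg.eigvals(cov_r @ cov_n)
    tr_sqrt = np.sqrt(np.clip(eig.real, 0.0, None)).sum()
    mean_term = float(np.sum((mu_r - mu_n) ** 2))
    cov_term = float(np.trace(cov_r) + np.trace(cov_n) - 2.0 * tr_sqrt)
    return mean_term + cov_term


def per_label_distances(ref_emb, ref_labels, new_emb, new_labels):
    """Distance computed separately for each label (hypothetical helper).

    Labels are assumed to be the classifier's predictions, since no
    ground truth is available in the unsupervised setting.
    """
    out = {}
    for label in np.unique(ref_labels):
        r = ref_emb[ref_labels == label]
        n = new_emb[new_labels == label]
        if len(r) >= 2 and len(n) >= 2:  # covariance needs >= 2 samples
            out[int(label)] = frechet_distance(r, n)
    return out


# Synthetic stand-ins for embeddings from a reference and a new window.
rng = np.random.default_rng(0)
ref_emb = rng.normal(size=(500, 8))
same_emb = rng.normal(size=(500, 8))            # same distribution
drift_emb = rng.normal(loc=1.0, size=(500, 8))  # shifted mean
labels = rng.integers(0, 2, size=500)

d_same = frechet_distance(ref_emb, same_emb)
d_drift = frechet_distance(ref_emb, drift_emb)
by_label = per_label_distances(ref_emb, labels, drift_emb, labels)
```

In this sketch a window would be flagged as drifted when its distance exceeds a threshold calibrated on windows known to be drift-free. How DriftLens actually sets such a threshold is not described in the abstract.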
