Kleanthis Malialis

SiameseDuo++: Active Learning from Data Streams with Dual Augmented Siamese Networks

Apr 06, 2025

Kleanthis Malialis, Stylianos Filippou, Christos G. Panayiotou, Marios M. Polycarpou

Abstract:Data stream mining, also known as stream learning, is a growing area which deals with learning from high-speed arriving data. Its relevance has surged recently due to its wide range of applicability, such as critical infrastructure monitoring, social media analysis, and recommender systems. The design of stream learning methods faces significant research challenges, from the nonstationary nature of the data (referred to as concept drift) and the fact that data streams are typically not annotated with the ground truth, to the requirement that such methods should process large amounts of data in real-time with limited memory. This work proposes the SiameseDuo++ method, which uses active learning to automatically select instances for a human expert to label according to a budget. Specifically, it incrementally trains two siamese neural networks which operate in synergy, augmented by generated examples. Both the proposed active learning strategy and augmentation operate in the latent space. SiameseDuo++ addresses the aforementioned challenges by operating with limited memory and limited labelling budget. Simulation experiments show that the proposed method outperforms strong baselines and state-of-the-art methods in terms of learning speed and/or performance. To promote open science we publicly release our code and datasets.

* Neurocomputing, Volume 637, 2025, 130083, ISSN 0925-2312
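
The abstract above describes selecting instances for labelling "according to a budget". As a rough illustration only, the sketch below shows a generic budget-constrained query rule for streams: request a label when the model's confidence is low and the fraction of labels requested so far is still under the budget. It is not the latent-space strategy of SiameseDuo++; the class name `BudgetedSelector`, the `confidence` input, and the threshold value are all assumptions made for this example.

```python
class BudgetedSelector:
    """Hypothetical budget-aware query rule for stream active learning."""

    def __init__(self, budget, threshold):
        self.budget = budget        # maximum fraction of instances to label
        self.threshold = threshold  # query only when confidence is below this
        self.seen = 0               # instances observed so far
        self.queried = 0            # labels requested so far

    def should_query(self, confidence):
        """Return True if a label should be requested for this instance."""
        self.seen += 1
        spent = self.queried / self.seen  # fraction of budget used so far
        if spent < self.budget and confidence < self.threshold:
            self.queried += 1
            return True
        return False


selector = BudgetedSelector(budget=0.5, threshold=0.7)
decisions = [selector.should_query(c) for c in [0.6, 0.6, 0.9, 0.6]]
print(decisions)
```

The second instance is uncertain but is skipped because half of the instances seen so far have already been labelled, which is how the budget caps the expert's workload.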

Transformer-based Multivariate Time Series Anomaly Localization

Jan 15, 2025

Charalampos Shimillas, Kleanthis Malialis, Konstantinos Fokianos, Marios M. Polycarpou

Abstract:With the growing complexity of Cyber-Physical Systems (CPS) and the integration of Internet of Things (IoT), the use of sensors for online monitoring generates large volumes of multivariate time series (MTS) data. Consequently, the need for robust anomaly diagnosis in MTS is paramount to maintaining system reliability and safety. While significant advancements have been made in anomaly detection, localization remains a largely underexplored area, though crucial for intelligent decision-making. This paper introduces a novel transformer-based model for unsupervised anomaly diagnosis in MTS, with a focus on improving localization performance, through an in-depth analysis of the self-attention mechanism's learning behavior under both normal and anomalous conditions. We formulate the anomaly localization problem as a three-stage process: time-step, window, and segment-based. This leads to the development of the Space-Time Anomaly Score (STAS), a new metric inspired by the connection between transformer latent representations and space-time statistical models. STAS is designed to capture individual anomaly behaviors and inter-series dependencies, delivering enhanced localization performance. Additionally, the Statistical Feature Anomaly Score (SFAS) complements STAS by analyzing statistical features around anomalies, with their combination helping to reduce false alarms. Experiments on real-world and synthetic datasets illustrate the model's superiority over state-of-the-art methods in both detection and localization tasks.

Online Detection of Water Contamination Under Concept Drift

Jan 03, 2025

Jin Li, Kleanthis Malialis, Stelios G. Vrachimis, Marios M. Polycarpou

Abstract:Water Distribution Networks (WDNs) are vital infrastructures, and contamination poses serious public health risks. Harmful substances can interact with disinfectants like chlorine, making chlorine monitoring essential for detecting contaminants. However, chlorine sensors often become unreliable and require frequent calibration. This study introduces the Dual-Threshold Anomaly and Drift Detection (AD&DD) method, an unsupervised approach combining a dual-threshold drift detection mechanism with an LSTM-based Variational Autoencoder (LSTM-VAE) for real-time contamination detection. Tested on two realistic WDNs, AD&DD effectively identifies anomalies with sensor offsets as concept drift, and outperforms other methods. A proposed decentralized architecture enables accurate contamination detection and localization by deploying AD&DD on selected nodes.
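
The abstract does not spell out how the two thresholds are used, so the following is only a guess at the general idea, not the AD&DD mechanism itself. The sketch assumes a stream of reconstruction errors (as an autoencoder might produce) and applies one threshold to single readings to flag short-lived anomalies, and a second, lower threshold to a whole recent window to flag a persistent shift such as a sensor offset. The class name `DualThresholdMonitor`, the window length, and the "all recent errors elevated" rule are assumptions for illustration.

```python
from collections import deque


class DualThresholdMonitor:
    """Hypothetical two-threshold rule over a stream of reconstruction errors."""

    def __init__(self, anomaly_threshold, drift_threshold, window=3):
        self.anomaly_threshold = anomaly_threshold  # applied to a single error
        self.drift_threshold = drift_threshold      # applied to the whole window
        self.errors = deque(maxlen=window)          # most recent errors only

    def update(self, error):
        """Classify the newest error as 'normal', 'anomaly', or 'drift'."""
        self.errors.append(error)
        window_full = len(self.errors) == self.errors.maxlen
        # Drift (assumed rule): every error in a full window is elevated.
        if window_full and min(self.errors) > self.drift_threshold:
            return "drift"
        # Anomaly: a single error exceeds the higher threshold.
        if error > self.anomaly_threshold:
            return "anomaly"
        return "normal"


monitor = DualThresholdMonitor(anomaly_threshold=1.0, drift_threshold=0.6)
stream = [0.1, 0.1, 2.0, 0.1, 0.1, 0.8, 0.8, 0.8]
states = [monitor.update(e) for e in stream]
print(states)
```

In this toy stream the isolated spike of 2.0 is reported as an anomaly, while the sustained run of 0.8 values is reported as drift only once it fills the window.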

Urban Water Consumption Forecasting Using Deep Learning and Correlated District Metered Areas

Dec 30, 2024

Kleanthis Malialis, Nefeli Mavri, Stelios G. Vrachimis, Marios S. Kyriakou, Demetrios G. Eliades, Marios M. Polycarpou

Abstract:Accurate water consumption forecasting is a crucial tool for water utilities and policymakers, as it helps ensure a reliable supply, optimize operations, and support infrastructure planning. Urban Water Distribution Networks (WDNs) are divided into District Metered Areas (DMAs), where water flow is monitored to efficiently manage resources. This work focuses on short-term forecasting of DMA consumption using deep learning and aims to address two key challenging issues. First, forecasting based solely on a DMA's historical data may lack broader context and provide limited insights. Second, DMAs may experience sensor malfunctions providing incorrect data, or some DMAs may not be monitored at all due to computational costs, complicating accurate forecasting. We propose a novel method that first identifies DMAs with correlated consumption patterns and then uses these patterns, along with the DMA's local data, as input to a deep learning model for forecasting. In a real-world study with data from five DMAs, we show that: i) the deep learning model outperforms a classical statistical model; ii) accurate forecasting can be carried out using only correlated DMAs' consumption patterns; and iii) even when a DMA's local data is available, including correlated DMAs' data improves accuracy.

* Keywords: urban water management, water consumption, time series forecasting
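
The first step of the method above, identifying DMAs with correlated consumption patterns, can be pictured with a small sketch. The abstract does not say which correlation measure or cut-off is used, so the Pearson coefficient, the 0.8 threshold, the function names, and the toy readings below are all assumptions for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def correlated_dmas(target, series, threshold=0.8):
    """Names of DMAs whose consumption correlates with the target DMA's."""
    return [
        name
        for name, values in series.items()
        if name != target and pearson(series[target], values) >= threshold
    ]


# Toy consumption readings for three hypothetical DMAs.
series = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 1.0, 3.0, 2.0],
}
selected = correlated_dmas("A", series)
print(selected)
```

The selected DMAs' series would then be passed, together with the target DMA's own history when it is available, as input features to the forecasting model.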

Incremental Learning with Concept Drift Detection and Prototype-based Embeddings for Graph Stream Classification

Apr 12, 2024

Kleanthis Malialis, Jin Li, Christos G. Panayiotou, Marios M. Polycarpou

Abstract:Data stream mining aims at extracting meaningful knowledge from continually evolving data streams, addressing the challenges posed by nonstationary environments, particularly concept drift, which refers to a change in the underlying data distribution over time. Graph structures offer a powerful modelling tool to represent complex systems, such as critical infrastructure systems and social networks. Learning from graph streams becomes a necessity to understand the dynamics of graph structures and to facilitate informed decision-making. This work introduces a novel method for graph stream classification which operates under the general setting where a data generating process produces graphs with varying nodes and edges over time. The method uses incremental learning for continual model adaptation, selecting representative graphs (prototypes) for each class, and creating graph embeddings. Additionally, it incorporates a loss-based concept drift detection mechanism to recalculate graph prototypes when drift is detected.

* IEEE World Congress on Computational Intelligence (WCCI) 2024; Keywords: graph streams, concept drift, incremental learning, graph prototypes, nonstationary environments

A Study of Data-driven Methods for Adaptive Forecasting of COVID-19 Cases

Sep 18, 2023

Charithea Stylianides, Kleanthis Malialis, Panayiotis Kolios

Abstract:Severe acute respiratory disease SARS-CoV-2 has had a profound impact on public health systems and healthcare emergency response, especially with respect to making decisions on the most effective measures to be taken at any given time. As demonstrated throughout the last three years with COVID-19, the prediction of the number of positive cases can be an effective way to facilitate decision-making. However, the limited availability of data and the highly dynamic and uncertain nature of the virus transmissibility make this task very challenging. To investigate these challenges and address this problem, this work studies data-driven (learning, statistical) methods for incrementally training models to adapt to these nonstationary conditions. An extensive empirical study is conducted to examine various characteristics, such as performance analysis on a per virus wave basis, feature extraction, "lookback" window size, and memory size, all for next-, 7-, and 14-day forecasting tasks. We demonstrate that the incremental learning framework can successfully address the aforementioned challenges and perform well during outbreaks, providing accurate predictions.

* International Conference on Artificial Neural Networks (ICANN), 2023
* Keywords: incremental learning, data streams, neural networks, time-series forecasting

Autoencoder-based Anomaly Detection in Streaming Data with Incremental Learning and Concept Drift Adaptation

May 15, 2023

Jin Li, Kleanthis Malialis, Marios M. Polycarpou

Abstract:In today's digital universe, enormous amounts of data are produced in a streaming manner in a variety of application areas. These data are often unlabelled. In this case, identifying infrequent events, such as anomalies, poses a great challenge. This problem becomes even more difficult in non-stationary environments, which can cause deterioration of the predictive performance of a model. To address the above challenges, the paper proposes an autoencoder-based incremental learning method with drift detection (strAEm++DD). Our proposed method strAEm++DD leverages the advantages of both incremental learning and drift detection. We conduct an experimental study using real-world and synthetic datasets with severe or extreme class imbalance, and provide an empirical analysis of strAEm++DD. We further conduct a comparative study, showing that the proposed method significantly outperforms existing baseline and advanced methods.

* Accepted at the International Joint Conference on Neural Networks (IJCNN), 2023

Unsupervised Unlearning of Concept Drift with Autoencoders

Nov 23, 2022

André Artelt, Kleanthis Malialis, Christos Panayiotou, Marios Polycarpou, Barbara Hammer

Abstract:The phenomenon of concept drift refers to a change of the data distribution affecting the data stream of future samples. Such non-stationary environments are often encountered in the real world. Consequently, learning models operating on the data stream might become obsolete, and need costly and difficult adjustments such as retraining or adaptation. Existing methods to address concept drift are, typically, categorised as active or passive. The former perform a complete model retraining when a drift detection mechanism triggers an alarm, while the latter continually adapt a model using incremental learning. We depart from the traditional avenues and propose for the first time an alternative approach which "unlearns" the effects of the concept drift. Specifically, we propose an autoencoder-based method for "unlearning" the concept drift in an unsupervised manner, without having to retrain or adapt any of the learning models operating on the data.

Data augmentation on-the-fly and active learning in data stream classification

Oct 13, 2022

Kleanthis Malialis, Dimitris Papatheodoulou, Stylianos Filippou, Christos G. Panayiotou, Marios M. Polycarpou

Abstract:There is an emerging need for predictive models to be trained on-the-fly, since in numerous machine learning applications data are arriving in an online fashion. A critical challenge encountered is that of limited availability of ground truth information (e.g., labels in classification tasks) as new data are observed one-by-one online, while another significant challenge is that of class imbalance. This work introduces the novel Augmented Queues method, which addresses the dual-problem by combining in a synergistic manner online active learning, data augmentation, and a multi-queue memory to maintain separate and balanced queues for each class. We perform an extensive experimental study using image and time-series augmentations, in which we examine the roles of the active learning budget, memory size, imbalance level, and neural network type. We demonstrate two major advantages of Augmented Queues. First, it does not reserve additional memory space as the generation of synthetic data occurs only at training times. Second, learning models have access to more labelled data without the need to increase the active learning budget and / or the original memory size. Learning on-the-fly poses major challenges which, typically, hinder the deployment of learning models. Augmented Queues significantly improves the performance in terms of learning quality and speed. Our code is made publicly available.

* IEEE Symposium Series on Computational Intelligence (SSCI), 2022
* Keywords: incremental learning, active learning, data streams, class imbalance, neural networks
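
To make the memory argument in the abstract above concrete, here is a minimal sketch of a multi-queue memory with one fixed-capacity queue per class, where synthetic examples are generated only when a training batch is built and are never stored. It is an illustration under assumptions, not the released Augmented Queues code: the names `MultiQueueMemory`, `training_batch`, and `jitter`, and the Gaussian-noise augmentation, are hypothetical.

```python
from collections import deque
import random


class MultiQueueMemory:
    """Hypothetical per-class memory with equal-capacity FIFO queues."""

    def __init__(self, num_classes, queue_size):
        # One bounded queue per class; the oldest example is evicted when full.
        self.queues = {c: deque(maxlen=queue_size) for c in range(num_classes)}

    def add(self, x, label):
        self.queues[label].append(x)

    def stored(self):
        """Number of examples actually held in memory."""
        return sum(len(q) for q in self.queues.values())

    def training_batch(self, augment, copies=1):
        """Stored examples plus `copies` synthetic variants each, built on demand."""
        batch = []
        for label, queue in self.queues.items():
            for x in queue:
                batch.append((x, label))
                for _ in range(copies):
                    batch.append((augment(x), label))
        return batch


def jitter(x, rng, scale=0.05):
    """Assumed augmentation: add small Gaussian noise to each feature."""
    return [v + rng.gauss(0.0, scale) for v in x]


rng = random.Random(0)
memory = MultiQueueMemory(num_classes=2, queue_size=3)
for i in range(5):                      # majority class: 5 arrivals, 3 kept
    memory.add([float(i), float(i)], label=0)
memory.add([9.0, 9.0], label=1)         # minority class: 1 arrival

batch = memory.training_batch(lambda x: jitter(x, rng), copies=2)
print(memory.stored(), len(batch))
```

The batch holds three times as many examples as the memory stores, yet the memory footprint is unchanged after training, which mirrors the first advantage claimed in the abstract.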

A Hybrid Active-Passive Approach to Imbalanced Nonstationary Data Stream Classification

Oct 12, 2022

Kleanthis Malialis, Manuel Roveri, Cesare Alippi, Christos G. Panayiotou, Marios M. Polycarpou

Abstract:In real-world applications, the process generating the data might suffer from nonstationary effects (e.g., due to seasonality, faults affecting sensors or actuators, and changes in the users' behaviour). These changes, often called concept drift, might induce severe (potentially catastrophic) impacts on trained learning models that become obsolete over time, and inadequate to solve the task at hand. Learning in the presence of concept drift aims at designing machine and deep learning models that are able to track and adapt to concept drift. Typically, techniques to handle concept drift are either active or passive, and traditionally, these have been considered to be mutually exclusive. Active techniques use an explicit drift detection mechanism, and re-train the learning algorithm when concept drift is detected. Passive techniques use an implicit method to deal with drift, and continually update the model using incremental learning. Unlike what is present in the literature, we propose a hybrid alternative which merges the two approaches, hence leveraging their advantages. The proposed method, called Hybrid-Adaptive REBAlancing (HAREBA), significantly outperforms strong baselines and state-of-the-art methods in terms of learning quality and speed; we also show experimentally that it is effective under severe class imbalance levels.

* IEEE Symposium Series on Computational Intelligence (SSCI), 2022
* Keywords: incremental learning, concept drift, class imbalance, data streams, nonstationary environments
