Paul Boniol

Few Labels are all you need: A Weakly Supervised Framework for Appliance Localization in Smart-Meter Series

Jun 06, 2025

Adrien Petralia, Paul Boniol, Philippe Charpentier, Themis Palpanas

Abstract:Improving smart grid system management is crucial in the fight against climate change, and enabling consumers to play an active role in this effort is a significant challenge for electricity suppliers. In this regard, millions of smart meters have been deployed worldwide in the last decade, recording the main electricity power consumed in individual households. These data provide valuable information that can help consumers reduce their electricity footprint; nevertheless, the collected signal aggregates the consumption of the different appliances running simultaneously in the house, making it difficult to interpret. Non-Intrusive Load Monitoring (NILM) refers to the challenge of estimating the power consumption, pattern, or on/off state activation of individual appliances using the main smart meter signal. Recent methods proposed to tackle this task are based on a fully supervised deep-learning approach that requires both the aggregate signal and the ground truth of individual appliance power. However, such labels are expensive to collect and extremely scarce in practice, as they require conducting intrusive surveys in households to monitor each appliance. In this paper, we introduce CamAL, a weakly supervised approach for appliance pattern localization that only requires information on the presence of an appliance in a household to be trained. CamAL combines an ensemble of deep-learning classifiers with an explainable classification method to localize appliance patterns. Our experimental evaluation, conducted on 4 real-world datasets, demonstrates that CamAL significantly outperforms existing weakly supervised baselines and that current SotA fully supervised NILM approaches require significantly more labels to reach CamAL's performance. The source code of our experiments is available at: https://github.com/adrienpetralia/CamAL. This paper appeared in ICDE 2025.

* In 2025 IEEE 41st International Conference on Data Engineering (ICDE), Hong Kong, 2025, pp. 4386-4399
* 12 pages, 10 figures. This paper appeared in IEEE ICDE 2025
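
The abstract above describes combining an ensemble of classifiers with an explainable classification method to localize appliance patterns. The sketch below illustrates the general idea with a generic class activation map (CAM): a weighted sum of the last convolutional feature maps, averaged over several models and thresholded into a per-timestep mask. It is an illustration under stated assumptions, not CamAL's actual procedure: the function names, the min-max normalization, the averaging rule, and the 0.5 threshold are all hypothetical, and the feature maps are toy values rather than outputs of trained networks.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Generic CAM for one model (assumed shapes, not CamAL's exact definition).

    feature_maps: array of shape (channels, timesteps), last conv layer output.
    class_weights: array of shape (channels,), weights of the target class.
    Returns a (timesteps,) map rescaled to [0, 1].
    """
    cam = class_weights @ feature_maps      # weighted sum over channels
    cam = cam - cam.min()                   # shift so the minimum is 0
    peak = cam.max()
    return cam / peak if peak > 0 else cam  # avoid dividing by zero

def localize(cams, threshold=0.5):
    """Average the per-model CAMs and threshold them into a 0/1 mask."""
    mean_cam = np.mean(cams, axis=0)
    return (mean_cam >= threshold).astype(int)

# Toy "ensemble" of two models over a series of 6 timesteps.
fm1 = np.array([[0.0, 0.0, 1.0, 1.0, 0.0, 0.0],
                [1.0, 1.0, 0.0, 0.0, 1.0, 1.0]])
w1 = np.array([1.0, -1.0])
fm2 = np.array([[0.0, 0.5, 1.0, 1.0, 0.5, 0.0],
                [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
w2 = np.array([2.0, 0.0])

cams = [class_activation_map(fm1, w1), class_activation_map(fm2, w2)]
mask = localize(cams)  # 1 where the ensemble attributes the class to a timestep
print(mask)
```

The point of the sketch is that the classifiers only need a series-level label (appliance present or not) for training, while the activation map provides a timestep-level signal at inference time.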

DeviceScope: An Interactive App to Detect and Localize Appliance Patterns in Electricity Consumption Time Series

Jun 06, 2025

Adrien Petralia, Paul Boniol, Philippe Charpentier, Themis Palpanas

Abstract:In recent years, electricity suppliers have installed millions of smart meters worldwide to improve the management of the smart grid system. These meters collect a large amount of electrical consumption data to produce valuable information to help consumers reduce their electricity footprint. However, having non-expert users (e.g., consumers or sales advisors) understand these data and derive usage patterns for different appliances has become a significant challenge for electricity suppliers because these data record the aggregated behavior of all appliances. At the same time, ground-truth labels (which could train appliance detection and localization models) are expensive to collect and extremely scarce in practice. This paper introduces DeviceScope, an interactive tool designed to facilitate understanding smart meter data by detecting and localizing individual appliance patterns within a given time period. Our system is based on CamAL (Class Activation Map-based Appliance Localization), a novel weakly supervised approach for appliance localization that only requires the knowledge of the existence of an appliance in a household to be trained. This paper appeared in ICDE 2025.

* In 2025 IEEE 41st International Conference on Data Engineering (ICDE), Hong Kong, 2025, pp. 4552-4555
* 4 pages, 5 figures. This paper appeared in ICDE 2025

Graphint: Graph-based Time Series Clustering Visualisation Tool

Mar 10, 2025

Paul Boniol, Donato Tiano, Angela Bonifati, Themis Palpanas

Abstract:With the exponential growth of time series data across diverse domains, there is a pressing need for effective analysis tools. Time series clustering is important for identifying patterns in these datasets. However, prevailing methods often encounter obstacles in maintaining data relationships and ensuring interpretability. We present Graphint, an innovative system based on the $k$-Graph methodology that addresses these challenges. Graphint integrates a robust time series clustering algorithm with an interactive tool for comparison and interpretation. More precisely, our system allows users to compare results against competing approaches, identify discriminative subsequences within specified datasets, and visualize the critical information utilized by $k$-Graph to generate outputs. Overall, Graphint offers a comprehensive solution for extracting actionable insights from complex temporal datasets.

$k$-Graph: A Graph Embedding for Interpretable Time Series Clustering

Feb 18, 2025

Paul Boniol, Donato Tiano, Angela Bonifati, Themis Palpanas

Abstract:Time series clustering poses a significant challenge with diverse applications across domains. A prominent drawback of existing solutions lies in their limited interpretability, often confined to presenting users with centroids. In addressing this gap, our work presents $k$-Graph, an unsupervised method explicitly crafted to augment interpretability in time series clustering. Leveraging a graph representation of time series subsequences, $k$-Graph constructs multiple graph representations based on different subsequence lengths. This feature accommodates variable-length time series without requiring users to predetermine subsequence lengths. Our experimental results reveal that $k$-Graph outperforms current state-of-the-art time series clustering algorithms in accuracy, while providing users with meaningful explanations and interpretations of the clustering outcomes.

VUS: Effective and Efficient Accuracy Measures for Time-Series Anomaly Detection

Feb 18, 2025

Paul Boniol, Ashwin K. Krishna, Marine Bruel, Qinghua Liu, Mingyi Huang, Themis Palpanas, Ruey S. Tsay, Aaron Elmore, Michael J. Franklin, John Paparrizos

Abstract:Anomaly detection (AD) is a fundamental task for time-series analytics with important implications for the downstream performance of many applications. In contrast to other domains where AD mainly focuses on point-based anomalies (i.e., outliers in standalone observations), AD for time series is also concerned with range-based anomalies (i.e., outliers spanning multiple observations). Nevertheless, it is common to use traditional point-based information retrieval measures, such as Precision, Recall, and F-score, to assess the quality of methods by thresholding the anomaly score to mark each point as an anomaly or not. However, mapping discrete labels into continuous data introduces unavoidable shortcomings, complicating the evaluation of range-based anomalies. Notably, the choice of evaluation measure may significantly bias the experimental outcome. Despite over six decades of attention, there has never been a large-scale systematic quantitative and qualitative analysis of time-series AD evaluation measures. This paper extensively evaluates quality measures for time-series AD to assess their robustness under noise, misalignments, and different anomaly cardinality ratios. Our results indicate that measures producing quality values independently of a threshold (i.e., AUC-ROC and AUC-PR) are more suitable for time-series AD. Motivated by this observation, we first extend the AUC-based measures to account for range-based anomalies. Then, we introduce a new family of parameter-free and threshold-independent measures, Volume Under the Surface (VUS), to evaluate methods while varying parameters. We also introduce two optimized implementations for VUS that significantly reduce the execution time of the initial implementation. Our findings demonstrate that our four measures are significantly more robust in assessing the quality of time-series AD methods.
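
To make the contrast between thresholded and threshold-independent measures concrete, here is a minimal sketch using standard point-wise definitions. It computes the F-score at several thresholds and the plain point-wise AUC-ROC on a toy score vector. It does not implement the range-based AUC extensions or VUS from the paper; the function names and the toy labels and scores are assumptions made for illustration.

```python
import numpy as np

def f1_at_threshold(labels, scores, threshold):
    """Point-wise F-score after marking every score >= threshold as anomalous."""
    pred = scores >= threshold
    tp = np.sum(pred & (labels == 1))
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

def auc_roc(labels, scores):
    """Point-wise AUC-ROC via its rank interpretation: the probability that a
    random anomalous point scores higher than a random normal point
    (ties count as one half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]  # all anomalous/normal score pairs
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (len(pos) * len(neg)))

# Toy series: one anomalous range covering timesteps 2 to 4.
labels = np.array([0, 0, 1, 1, 1, 0, 0, 0])
scores = np.array([0.1, 0.2, 0.6, 0.9, 0.4, 0.5, 0.1, 0.2])

for t in (0.35, 0.5, 0.8):
    print(f"F-score at threshold {t}: {f1_at_threshold(labels, scores, t):.3f}")
print(f"AUC-ROC (no threshold): {auc_roc(labels, scores):.3f}")
```

The same score vector yields a different F-score for each threshold, while AUC-ROC produces a single value without requiring that choice.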

Dive into Time-Series Anomaly Detection: A Decade Review

Dec 29, 2024

Paul Boniol, Qinghua Liu, Mingyi Huang, Themis Palpanas, John Paparrizos

Abstract:Recent advances in data collection technology, accompanied by the ever-rising volume and velocity of streaming data, underscore the vital need for time series analytics. In this regard, time-series anomaly detection has been an important activity, entailing various applications in fields such as cyber security, financial markets, law enforcement, and health care. While traditional literature on anomaly detection is centered on statistical measures, the increasing number of machine learning algorithms in recent years calls for a structured, general characterization of the research methods for time-series anomaly detection. This survey groups and summarizes existing anomaly detection solutions under a process-centric taxonomy in the time series context. In addition to giving an original categorization of anomaly detection methods, we also perform a meta-analysis of the literature and outline general trends in time-series anomaly detection research.

Appliance Detection Using Very Low-Frequency Smart Meter Time Series

May 21, 2023

Adrien Petralia, Philippe Charpentier, Paul Boniol, Themis Palpanas

Abstract:In recent years, smart meters have been widely adopted by electricity suppliers to improve the management of the smart grid system. These meters usually collect energy consumption data at a very low frequency (every 30min), enabling utilities to bill customers more accurately. To provide more personalized recommendations, the next step is to detect the appliances owned by customers, which is a challenging problem, due to the very-low meter reading frequency. Even though the appliance detection problem can be cast as a time series classification problem, with many such classifiers having been proposed in the literature, no study has applied and compared them on this specific problem. This paper presents an in-depth evaluation and comparison of state-of-the-art time series classifiers applied to detecting the presence/absence of diverse appliances in very low-frequency smart meter data. We report results with five real datasets. We first study the impact of the detection quality of 13 different appliances using 30min sampled data, and we subsequently propose an analysis of the possible detection performance gain by using a higher meter reading frequency. The results indicate that the performance of current time series classifiers varies significantly. Some of them, namely deep learning-based classifiers, provide promising results in terms of accuracy (especially for certain appliances), even using 30min sampled data, and are scalable to the large smart meter time series collections of energy consumption data currently available to electricity suppliers. Nevertheless, our study shows that more work is needed in this area to further improve the accuracy of the proposed solutions. This paper appeared in ACM e-Energy 2023.

* 11 pages, 7 figures. This paper appeared in ACM e-Energy 2023

Series2Graph: Graph-based Subsequence Anomaly Detection for Time Series

Jul 25, 2022

Paul Boniol, Themis Palpanas

Abstract:Subsequence anomaly detection in long sequences is an important problem with applications in a wide range of domains. However, the approaches proposed so far in the literature have severe limitations: they either require prior domain knowledge used to design the anomaly discovery algorithms, or become cumbersome and expensive to use in situations with recurrent anomalies of the same type. In this work, we address these problems, and propose an unsupervised method suitable for domain agnostic subsequence anomaly detection. Our method, Series2Graph, is based on a graph representation of a novel low-dimensionality embedding of subsequences. Series2Graph needs neither labeled instances (like supervised techniques) nor anomaly-free data (like zero-positive learning techniques), and identifies anomalies of varying lengths. The experimental results, on the largest set of synthetic and real datasets used to date, demonstrate that the proposed approach correctly identifies single and recurrent anomalies without any prior knowledge of their characteristics, outperforming by a large margin several competing approaches in accuracy, while being up to orders of magnitude faster. This paper has appeared in VLDB 2020.

* Proceedings of the VLDB Endowment, Volume 13, Issue 12, Pages 1821-1834, 2020

dCAM: Dimension-wise Class Activation Map for Explaining Multivariate Data Series Classification

Jul 25, 2022

Paul Boniol, Mohammed Meftah, Emmanuel Remy, Themis Palpanas

Abstract:Data series classification is an important and challenging problem in data science. Explaining the classification decisions by finding the discriminant parts of the input that led the algorithm to some decisions is a real need in many applications. Convolutional neural networks perform well for the data series classification task; however, the explanations provided by this type of algorithm are poor for the specific case of multivariate data series. Addressing this important limitation is a significant challenge. In this paper, we propose a novel method that solves this problem by highlighting both the temporal and dimensional discriminant information. Our contribution is two-fold: we first describe a convolutional architecture that enables the comparison of dimensions; then, we propose a method that returns dCAM, a Dimension-wise Class Activation Map specifically designed for multivariate time series (and CNN-based models). Experiments with several synthetic and real datasets demonstrate that dCAM is not only more accurate than previous approaches, but also the only viable solution for discriminant feature discovery and classification explanation in multivariate time series. This paper has appeared in SIGMOD'22.

* Proceedings of the 2022 International Conference on Management of Data (SIGMOD '22), June 12--17, 2022, Philadelphia, PA, USA

Performance in the Courtroom: Automated Processing and Visualization of Appeal Court Decisions in France

Jul 09, 2020

Paul Boniol, George Panagopoulos, Christos Xypolopoulos, Rajaa El Hamdani, David Restrepo Amariles, Michalis Vazirgiannis

Abstract:Artificial Intelligence techniques are already popular and important in the legal domain. We extract legal indicators from judicial judgments to reduce the information asymmetry of the legal system and the access-to-justice gap. We use NLP methods to extract relevant entities and data from judgments in order to construct networks of lawyers and judgments. We propose metrics to rank lawyers based on their experience, their win/loss ratio, and their importance in the network of lawyers. We also perform community detection in the network of judgments and propose metrics that represent the difficulty of cases by capitalising on community features.
