Pedro Casas

Towards Foundation Auto-Encoders for Time-Series Anomaly Detection

Jul 02, 2025

Gastón García González, Pedro Casas, Emilio Martínez, Alicia Fernández

Abstract: We investigate a novel approach to time-series modeling, inspired by the successes of large pretrained foundation models. We introduce FAE (Foundation Auto-Encoders), a foundation generative-AI model for anomaly detection in time-series data, based on Variational Auto-Encoders (VAEs). By foundation, we mean a model pretrained on massive amounts of time-series data which can learn complex temporal patterns useful for accurate modeling, forecasting, and detection of anomalies on previously unseen datasets. FAE leverages VAEs and Dilated Convolutional Neural Networks (DCNNs) to build a generic model for univariate time-series modeling, which could eventually perform properly in out-of-the-box, zero-shot anomaly detection applications. We introduce the main concepts of FAE, and present preliminary results on different multi-dimensional time-series datasets from various domains, including a real dataset from an operational mobile ISP, and the well-known KDD 2021 Anomaly Detection dataset.
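The abstract names dilated convolutions as a building block but gives no architectural details. As a rough illustration of why dilation helps with long temporal patterns, the sketch below implements a single dilated 1-D convolution in plain Python and computes the receptive field of a stack of such layers. The causal (look-back only) form, the function names, the kernel size, and the dilation schedule are all assumptions made for illustration; they are not taken from the FAE paper.

```python
from typing import List, Sequence


def dilated_causal_conv1d(x: Sequence[float], weights: Sequence[float], dilation: int) -> List[float]:
    """Causal 1-D convolution with a dilation factor.

    Output at time t is sum_j weights[j] * x[t - j * dilation];
    samples before the start of the series are treated as 0.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation  # look back j * dilation steps
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input samples that influence one output of the stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    # Hypothetical stack: kernel size 2 with dilations doubling at each layer.
    print(receptive_field(2, [1, 2, 4, 8]))
    print(dilated_causal_conv1d([1, 2, 3, 4], [1, 1], dilation=2))
```

With doubling dilations the receptive field grows exponentially with depth while the number of weights grows only linearly, which is the usual motivation for this layer type in time-series models.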

* Presented at ACM KDD 2024, MiLeTS 2024 Workshop, August 25, 2024, Barcelona, Spain

Smart Active Sampling to enhance Quality Assurance Efficiency

Sep 23, 2022

Clemens Heistracher, Stefan Stricker, Pedro Casas, Daniel Schall, Jana Kemnitz

Abstract: We propose a new sampling strategy, called smart active sampling, for quality inspections outside the production line. Based on the principles of active learning, a machine learning model decides which samples are sent to quality inspection. On the one hand, this minimizes the production of scrap parts due to earlier detection of quality violations. On the other hand, quality inspection costs are reduced for smooth operation.
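The abstract does not say which active-learning criterion the model uses to pick samples. One common choice is uncertainty sampling, where the parts whose predicted outcome is least certain are sent to inspection first. The sketch below assumes that criterion, a hypothetical per-part violation probability produced by some upstream model, and a fixed inspection budget; none of these details come from the paper.

```python
from typing import Dict, List


def select_for_inspection(violation_prob: Dict[str, float], budget: int) -> List[str]:
    """Return up to `budget` part IDs, most uncertain first.

    Uncertainty is measured as closeness of the predicted violation
    probability to 0.5 (an assumed criterion, not the authors' method).
    """
    ranked = sorted(violation_prob, key=lambda part: abs(violation_prob[part] - 0.5))
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical model outputs for four produced parts.
    probs = {"part-a": 0.02, "part-b": 0.48, "part-c": 0.91, "part-d": 0.60}
    print(select_for_inspection(probs, budget=2))
```

Under this criterion, parts the model is already confident about (very low or very high probability) are skipped, which is how inspection effort can drop during smooth operation.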

On the Usage of Generative Models for Network Anomaly Detection in Multivariate Time-Series

Oct 16, 2020

Gastón García González, Pedro Casas, Alicia Fernández, Gabriel Gómez

Abstract: Despite the many attempts and approaches for anomaly detection explored over the years, the automatic detection of rare events in data communication networks remains a complex problem. In this paper we introduce Net-GAN, a novel approach to network anomaly detection in time-series, using recurrent neural networks (RNNs) and generative adversarial networks (GANs). Unlike the state of the art, which traditionally focuses on univariate measurements, Net-GAN detects anomalies in multivariate time-series, exploiting temporal dependencies through RNNs. Net-GAN discovers the underlying distribution of the baseline, multivariate data, without making any assumptions on its nature, offering a powerful approach to detect anomalies in complex, difficult-to-model network monitoring data. We further exploit the concepts behind generative models to conceive Net-VAE, a complementary approach to Net-GAN for network anomaly detection, based on variational auto-encoders (VAEs). We evaluate Net-GAN and Net-VAE in different monitoring scenarios, including anomaly detection in IoT sensor data, and intrusion detection in network measurements. Generative models represent a promising approach for network anomaly detection, especially when considering the complexity and ever-growing number of time-series to monitor in operational networks.

* To be published in ACM SIGMETRICS Performance Evaluation Review

DeepMAL -- Deep Learning Models for Malware Traffic Detection and Classification

Mar 10, 2020

Gonzalo Marín, Pedro Casas, Germán Capdehourat

Abstract: Robust network security systems are essential to prevent and mitigate the harmful effects of the ever-growing occurrence of network attacks. In recent years, machine learning-based systems have gained popularity for network security applications, usually considering the application of shallow models, which rely on the careful engineering of expert, handcrafted input features. The main limitation of this approach is that handcrafted features can fail to perform well under different scenarios and types of attacks. Deep Learning (DL) models can overcome this limitation using their ability to learn feature representations from raw, non-processed data. In this paper we explore the power of DL models on the specific problem of detection and classification of malware network traffic. As a major advantage with respect to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. We introduce DeepMAL, a DL model which is able to capture the underlying statistics of malicious traffic, without any sort of expert handcrafted features. Using publicly available traffic traces containing different families of malware traffic, we show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional, shallow-like models.
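The abstract mentions packet-level and flow-level representations built directly from raw bytes, without giving their exact shape. A minimal sketch of what such inputs could look like follows: each packet is truncated or zero-padded to a fixed number of bytes and scaled to [0, 1], and a flow is the stack of its first few packets. The function names, the sizes `n_bytes` and `n_packets`, and the padding and scaling choices are illustrative assumptions, not the representation used in DeepMAL.

```python
from typing import List, Sequence


def packet_to_vector(payload: bytes, n_bytes: int = 8) -> List[float]:
    """Truncate or zero-pad a packet to n_bytes and scale each byte to [0, 1]."""
    clipped = payload[:n_bytes].ljust(n_bytes, b"\x00")
    return [b / 255.0 for b in clipped]


def flow_to_matrix(packets: Sequence[bytes], n_packets: int = 2, n_bytes: int = 8) -> List[List[float]]:
    """Stack the first n_packets of a flow; short flows get all-zero rows."""
    rows = [packet_to_vector(p, n_bytes) for p in packets[:n_packets]]
    while len(rows) < n_packets:
        rows.append([0.0] * n_bytes)
    return rows


if __name__ == "__main__":
    # Hypothetical two-packet flow with very short payloads.
    flow = [b"\xff\x00\x10", b"\x80"]
    for row in flow_to_matrix(flow, n_packets=3, n_bytes=4):
        print(row)
```

Fixed-size inputs like these can be fed to a neural network without any handcrafted statistics, which is the property the abstract emphasizes.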

* 3rd International Data Science Conference (IDSC 2020)

EXPLAIN-IT: Towards Explainable AI for Unsupervised Network Traffic Analysis

Mar 03, 2020

Andrea Morichetta, Pedro Casas, Marco Mellia

Abstract: The application of unsupervised learning approaches, and in particular of clustering techniques, represents a powerful exploration means for the analysis of network measurements. Discovering underlying data characteristics, grouping similar measurements together, and identifying potential patterns of interest are some of the applications which can be tackled through clustering. Being unsupervised, clustering does not always provide precise and clear insight into the produced output, especially when the input data structure and distribution are complex and difficult to grasp. In this paper we introduce EXPLAIN-IT, a methodology which deals with unlabeled data, creates meaningful clusters, and suggests an explanation for the clustering results to the end-user. EXPLAIN-IT relies on a novel explainable Artificial Intelligence (AI) approach, which makes it possible to understand the reasons leading to a particular decision of a supervised learning-based model, additionally extending its application to the unsupervised learning domain. We apply EXPLAIN-IT to the problem of YouTube video quality classification under encrypted traffic scenarios, showing promising results.

Two Decades of AI4NETS-AI/ML for Data Networks: Challenges & Research Directions

Mar 03, 2020

Pedro Casas

Abstract: The popularity of Artificial Intelligence (AI), and of Machine Learning (ML) as an approach to AI, has dramatically increased in the last few years, due to its outstanding performance in various domains, notably in image, audio, and natural language processing. In these domains, AI success stories are boosting the applied field. When it comes to AI/ML for data communication Networks (AI4NETS), and despite the many attempts to turn networks into learning agents, the successful application of AI/ML in networking is limited. There is a strong resistance against AI/ML-based solutions, and a striking gap between the extensive academic research and the actual deployments of such AI/ML-based systems in operational environments. The truth is, there are still many unsolved complex challenges associated with the analysis of networking data through AI/ML, which hinder its acceptance and adoption in practice. In this position paper I elaborate on the most important show-stoppers in AI4NETS, and present a research agenda to tackle some of these challenges, enabling a natural adoption of AI/ML for networking. In particular, I focus the future research in AI4NETS around three major pillars: (i) to make AI/ML immediately applicable in networking problems through the concepts of effective learning, turning it into a useful and reliable way to deal with complex data-driven networking problems; (ii) to boost the adoption of AI/ML at large scale by learning from the Internet paradigm itself, conceiving novel distributed and hierarchical learning approaches mimicking the distributed topological principles and operation of the Internet; and (iii) to exploit the softwarization and distribution of networks to conceive AI/ML-defined Networks (AIDN), relying on the distributed generation and reuse of knowledge through novel Knowledge Delivery Networks (KDNs).

* 5th IEEE/IFIP International Workshop on Analytics for Network and Service Management (AnNet 2020)
