Francesco Zola

Synthetic Pattern Generation and Detection of Financial Activities using Graph Autoencoders

Jan 29, 2026

Francesco Zola, Lucia Muñoz, Andrea Venturi, Amaia Gil

Abstract: Illicit financial activities such as money laundering often manifest through recurrent topological patterns in transaction networks. Detecting these patterns automatically remains challenging due to the scarcity of labeled real-world data and strict privacy constraints. To address this, we investigate whether Graph Autoencoders (GAEs) can effectively learn and distinguish topological patterns that mimic money laundering operations when trained on synthetic data. The analysis consists of two phases: (i) data generation, where synthetic samples are created for seven well-known illicit activity patterns using parametrized generators that preserve structural consistency while introducing realistic variability; and (ii) model training and validation, where separate GAEs are trained on each pattern without explicit labels, relying solely on reconstruction error as an indicator of learned structure. We compare three GAE implementations based on three distinct convolutional layers: Graph Convolutional (GAE-GCN), GraphSAGE (GAE-SAGE), and Graph Attention Network (GAE-GAT). Experimental results show that GAE-GCN achieves the most consistent reconstruction performance across patterns, while GAE-SAGE and GAE-GAT exhibit competitive results only on a few specific patterns. These findings suggest that graph-based representation learning on synthetic data provides a viable path toward developing AI-driven tools for detecting illicit behaviors, overcoming the limitations of financial datasets.

* Accepted to the 7th International Workshop on Statistical Methods and Artificial Intelligence (IWSMAI'26)
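
The abstract above relies on reconstruction error as the signal that a GAE has learned a pattern. A minimal sketch of that idea follows, assuming an undirected graph and an inner-product decoder; the hand-set embeddings stand in for a trained encoder, and the pattern names `fan_out` and `cycle` are illustrative labels, none of which are taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reconstruction_error(edges, embeddings):
    """Mean binary cross-entropy between the true adjacency and the
    inner-product reconstruction sigmoid(z_i . z_j), over unordered pairs."""
    n = len(embeddings)
    edge_set = {frozenset(e) for e in edges}
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            score = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            p = sigmoid(score)  # predicted probability of an edge (i, j)
            target = 1.0 if frozenset((i, j)) in edge_set else 0.0
            total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
            pairs += 1
    return total / pairs


# Assumption: embeddings a model trained on fan-out graphs might produce.
# Node 0 scores high against nodes 1 and 2; nodes 1 and 2 score low together.
embeddings = [(0.0, 2.0), (2.0, 1.0), (-2.0, 1.0)]

fan_out = [(0, 1), (0, 2)]        # one sender, two receivers
cycle = [(0, 1), (1, 2), (0, 2)]  # closed loop over the same three nodes

err_fan_out = reconstruction_error(fan_out, embeddings)
err_cycle = reconstruction_error(cycle, embeddings)
print(err_fan_out < err_cycle)  # the matching pattern reconstructs better
```

With one model per pattern, a graph could then be assigned to the pattern whose model reconstructs it with the lowest error; whether the paper uses exactly this decision rule is not stated in the abstract.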

A Graph Machine Learning Approach for Detecting Topological Patterns in Transactional Graphs

Sep 16, 2025

Francesco Zola, Jon Ander Medina, Andrea Venturi, Amaia Gil, Raul Orduna

Abstract: The rise of digital ecosystems has exposed the financial sector to evolving abuse and criminal tactics that share operational knowledge and techniques both within and across different environments (fiat-based, crypto-assets, etc.). Traditional rule-based systems lack the adaptability needed to detect sophisticated or coordinated criminal behaviors (patterns), highlighting the need for strategies that analyze actors' interactions to uncover suspicious activities and extract their modus operandi. For this reason, in this work, we propose an approach that integrates graph machine learning and network analysis to improve the detection of well-known topological patterns within transactional graphs. However, a key challenge lies in the limitations of traditional financial datasets, which often provide sparse, unlabeled information that is difficult to use for graph-based pattern analysis. Therefore, we first propose a four-step preprocessing framework that involves (i) extracting graph structures, (ii) considering data temporality to manage large node sets, (iii) detecting communities within them, and (iv) applying automatic labeling strategies to generate weak ground-truth labels. Then, once the data is processed, Graph Autoencoders are implemented to distinguish among the well-known topological patterns. Specifically, three different GAE variants are implemented and compared in this analysis. Preliminary results show that this pattern-focused, topology-driven method is effective for detecting complex financial crime schemes, offering a promising alternative to conventional rule-based detection systems.

* Paper accepted @ Workshop on AI for Financial Crime Fight (AI4FCF @ ICDM 2025)

Enhancing Law Enforcement Training: A Gamified Approach to Detecting Terrorism Financing

Mar 20, 2024

Francesco Zola, Lander Segurola, Erin King, Martin Mullins, Raul Orduna

Abstract: Tools for fighting cyber-criminal activities using new technologies are promoted and deployed every day. However, too often, they are unnecessarily complex and hard to use, requiring deep domain and technical knowledge. These characteristics often limit the engagement of law enforcement and end-users with these technologies, which, despite their potential, remain misunderstood. For this reason, in this study, we describe our experience in combining learning and training methods and the potential benefits of gamification to enhance technology transfer and increase adult learning. In this case, participants are experienced practitioners in professions/industries that are exposed to terrorism financing (such as Law Enforcement Officers, Financial Investigation Officers, private investigators, etc.). We define training activities on different levels to increase the exchange of information about new trends and criminal modus operandi among and within law enforcement agencies, intensifying cross-border cooperation and supporting efforts to combat and prevent terrorism funding activities. In addition, a game (hackathon) is designed to address realistic challenges related to the dark net, crypto assets, new payment systems and dark web marketplaces that could be used for terrorist activities. The entire methodology was evaluated using quizzes, contest results, and engagement metrics. In particular, training events show that about 60% of participants completed the 11-week training course, while the Hackathon results, gathered in two pilot studies (Madrid and The Hague), show increasing expertise among the participants (on average, a progression in the points achieved). At the same time, more than 70% of participants positively evaluate the use of the gamification approach, and more than 85% of them consider the implemented Use Cases suitable for their investigations.

* International Journal of Police Science & Management, Sage, 0(0), [2024]

NeuralSentinel: Safeguarding Neural Network Reliability and Trustworthiness

Feb 12, 2024

Xabier Echeberria-Barrio, Mikel Gorricho, Selene Valencia, Francesco Zola

Abstract: The usage of Artificial Intelligence (AI) systems has increased exponentially, thanks to their ability to reduce the amount of data to be analyzed and the user effort while preserving a high rate of accuracy. However, introducing this new element in the loop has turned these systems into attack points that can compromise reliability. This new scenario has raised crucial challenges regarding the reliability and trustworthiness of AI models, as well as the uncertainties in their response decisions, which become even more crucial when the models are applied in critical domains such as healthcare, chemical and electrical plants, etc. To address these issues, in this paper, we present NeuralSentinel (NS), a tool able to validate the reliability and trustworthiness of AI models. This tool combines attack and defence strategies and explainability concepts to stress an AI model and help non-expert staff increase their confidence in this new system by understanding the model decisions. NS provides a simple and easy-to-use interface that helps humans in the loop deal with all the needed information. This tool was deployed and used in a Hackathon event to evaluate the reliability of a skin cancer image detector. During the event, experts and non-experts attacked and defended the detector, learning which factors were the most important for model misclassification and which techniques were the most efficient. The event was also used to detect NS's limitations and gather feedback for further improvements.

* CS and IT Conference Proceedings, 2024

Conditional Generative Adversarial Network for keystroke presentation attack

Dec 16, 2022

Idoia Eizaguirre-Peral, Lander Segurola-Gil, Francesco Zola

Abstract: Cybersecurity is a crucial step in data protection to ensure user security and personal data privacy. In this sense, many companies have started to control and restrict access to their data using authentication systems. However, these traditional authentication methods are not enough to ensure data protection, and for this reason, behavioral biometrics have gained importance. Despite their promising results and wide range of applications, biometric systems have been shown to be vulnerable to malicious attacks, such as Presentation Attacks. For this reason, in this work, we study a new approach for deploying a presentation attack against a keystroke authentication system. Our idea is to use Conditional Generative Adversarial Networks (cGAN) to generate synthetic keystroke data that can be used for impersonating an authorized user. These synthetic data are generated following two different real use cases, one in which the order of the typed words is known (ordered dynamic) and the other in which this order is unknown (no-ordered dynamic). Finally, both keystroke dynamics (ordered and no-ordered) are validated using an external keystroke authentication system. Results indicate that the cGAN can effectively generate keystroke dynamics patterns that can be used for deceiving keystroke authentication systems.

* 2022 Spanish Cybersecurity Research Conference (JNIC 2022)

Verification system based on long-range iris and Graph Siamese Neural Networks

Jul 28, 2022

Francesco Zola, Jose Alvaro Fernandez-Carrasco, Jan Lukas Bruse, Mikel Galar, Zeno Geradts

Abstract: Biometric systems represent valid solutions in tasks like user authentication and verification, since they are able to analyze physical and behavioural features with high precision. However, especially when physical biometrics are used, as is the case for iris recognition, they require specific hardware such as retina scanners, sensors, or HD cameras to achieve relevant results. At the same time, they require the users to be very close to the camera to extract high-resolution information. For this reason, in this work, we propose a novel approach that uses long-range (LR) distance images for implementing an iris verification system. More specifically, we present a novel methodology for converting LR iris images into graphs and then use Graph Siamese Neural Networks (GSNN) to predict whether two graphs belong to the same person. In this study, we not only describe this methodology but also evaluate how the spectral components of these images can be used for improving the graph extraction and the final classification task. Results demonstrate the suitability of this approach, encouraging the community to explore graph applications in biometric systems.

* Accepted to 2022 3rd Symposium on Pattern Recognition and Applications

NBcoded: network attack classifiers based on Encoder and Naive Bayes model for resource limited devices

Sep 15, 2021

Lander Segurola-Gil, Francesco Zola, Xabier Echeberria-Barrio, Raul Orduna-Urrutia

Abstract: In recent years, cybersecurity has gained high relevance, making the detection of attacks or intrusions a key task. In fact, a small breach in a system, application, or network can cause huge damage to companies. However, when attack detection is addressed with Artificial Intelligence, it often relies on high-quality classifiers that place high demands on computation or memory. This has a high impact when the attack classifiers need to run on resource-limited devices or without degrading device performance, as happens, for example, with IoT devices or industrial systems. To overcome this issue, NBcoded, a novel lightweight attack classification tool, is proposed in this work. NBcoded works as a pipeline that combines the noise-removal properties of encoders with the low resource and time consumption of the Naive Bayes classifier. This work compares three different NBcoded implementations based on three different Naive Bayes likelihood distribution assumptions (Gaussian, Complement and Bernoulli). Then, the best NBcoded is compared with state-of-the-art classifiers like Multilayer Perceptron and Random Forest. Our implementation proves to be the best model at reducing training time and disk usage, even though it is outperformed by the other two in terms of Accuracy and F1-score (by about 2%).

* To be published in "Communications in Computer and Information Science" and presented at the 3rd Workshop of Machine Learning for Cybersecurity (MLCS)

Temporal graph-based approach for behavioural entity classification

May 11, 2021

Francesco Zola, Lander Segurola, Jan Lukas Bruse, Mikel Galar Idoate

Abstract: Graph-based analyses have gained a lot of relevance in the past years due to their high potential in describing complex systems by detailing the actors involved, their relations and their behaviours. Nevertheless, in scenarios where these aspects evolve over time, it is not easy to extract valuable information or to correctly characterize all the actors. In this study, a two-phase approach for exploiting the potential of graph structures in the cybersecurity domain is presented. The main idea is to convert a network classification problem into a graph-based behavioural one. We extract graph structures that can represent the evolution of both normal and attack entities and apply a temporal dissection approach in order to highlight their micro-dynamics. Further, three clustering techniques are applied to the normal entities in order to aggregate similar behaviours, mitigate the imbalance problem and reduce noisy data. Our approach suggests the implementation of two promising deep learning paradigms for entity classification based on Graph Convolutional Networks.

Generative Adversarial Networks for Bitcoin Data Augmentation

May 27, 2020

Francesco Zola, Jan Lukas Bruse, Xabier Etxeberria Barrio, Mikel Galar, Raul Orduna Urrutia

Abstract: In Bitcoin entity classification, results are strongly conditioned by the ground-truth dataset, especially when applying supervised machine learning approaches. However, these ground-truth datasets are frequently affected by significant class imbalance, as they generally contain much more information regarding legal services (Exchange, Gambling) than regarding services that may be related to illicit activities (Mixer, Service). Class imbalance increases the complexity of applying machine learning techniques and reduces the quality of classification results, especially for underrepresented but critical classes. In this paper, we propose to address this problem by using Generative Adversarial Networks (GANs) for Bitcoin data augmentation, as GANs have recently shown promising results in the domain of image classification. However, there is no "one-fits-all" GAN solution that works for every scenario. In fact, setting GAN training parameters is non-trivial and heavily affects the quality of the generated synthetic data. We therefore evaluate how GAN parameters such as the optimization function, the size of the dataset and the chosen batch size affect GAN implementation for one underrepresented entity class (Mining Pool) and demonstrate how a "good" GAN configuration can be obtained that achieves high similarity between synthetically generated and real Bitcoin address data. To the best of our knowledge, this is the first study presenting GANs as a valid tool for generating synthetic address data for data augmentation in Bitcoin entity classification.

* 8 pages, 5 figures, 4 tables

Cascading Machine Learning to Attack Bitcoin Anonymity

Oct 15, 2019

Francesco Zola, Maria Eguimendia, Jan Lukas Bruse, Raul Orduna Urrutia

Abstract: Bitcoin is a decentralized, pseudonymous cryptocurrency that is one of the most used digital assets to date. Its unregulated nature and the inherent anonymity of its users have led to a dramatic increase in its use for illicit activities. This calls for the development of novel methods capable of characterizing different entities in the Bitcoin network. In this paper, a method to attack Bitcoin anonymity is presented, leveraging a novel cascading machine learning approach that requires only a few features directly extracted from Bitcoin blockchain data. Cascading, used to enrich entity information with data from previous classifications, led to considerably improved multi-class classification performance with excellent values of Precision close to 1.0 for each considered class. Final models were implemented and compared using different machine learning models and showed significantly higher accuracy compared to their baseline implementation. Our approach can contribute to the development of effective tools for Bitcoin entity characterization, which may assist in uncovering illegal activities.

* 15 pages, 7 figures, 4 tables, presented at 2019 IEEE International Conference on Blockchain (Blockchain)
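
The abstract above describes cascading as enriching entity information with data from previous classifications. One possible reading of that data flow is sketched below, assuming a toy nearest-centroid classifier, made-up two-dimensional features and made-up labels; the helper names `fit_centroids`, `predict` and `augment` are hypothetical, and the paper's actual models and features are not reproduced here.

```python
from collections import defaultdict


def fit_centroids(X, y):
    """Toy classifier: store the mean feature vector of each class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        prev = sums.get(label, [0.0] * len(row))
        sums[label] = [s + v for s, v in zip(prev, row)]
        counts[label] += 1
    return {c: [s / counts[c] for s in total] for c, total in sums.items()}


def predict(centroids, row):
    """Return the class whose centroid is closest to the row."""
    def sqdist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda c: sqdist(centroids[c]))


def augment(X, preds, classes):
    """Append a one-hot encoding of the previous stage's prediction."""
    return [row + [1.0 if p == c else 0.0 for c in classes]
            for row, p in zip(X, preds)]


# Toy data (assumption): two features per entity, two entity classes.
X = [[1.0, 0.2], [0.9, 0.1], [0.1, 1.0], [0.2, 0.8]]
y = ["exchange", "exchange", "mixer", "mixer"]
classes = sorted(set(y))

# Stage 1: classify using the base features only.
stage1 = fit_centroids(X, y)
stage1_preds = [predict(stage1, row) for row in X]

# Stage 2: retrain on features enriched with the stage-1 output.
X_aug = augment(X, stage1_preds, classes)
stage2 = fit_centroids(X_aug, y)
stage2_preds = [predict(stage2, row) for row in X_aug]
print(stage2_preds)
```

In practice the stage-1 predictions fed to stage 2 would normally come from held-out data rather than from the training rows, so that the second stage does not learn from leaked labels.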
