Abstract:Over the past few decades, electroencephalography (EEG) monitoring has become a pivotal tool for diagnosing neurological disorders, particularly for detecting seizures. Epilepsy, one of the most prevalent neurological diseases worldwide, affects approximately the 1 \% of the population. These patients face significant risks, underscoring the need for reliable, continuous seizure monitoring in daily life. Most of the techniques discussed in the literature rely on supervised Machine Learning (ML) methods. However, the challenge of accurately labeling variations in epileptic EEG waveforms complicates the use of these approaches. Additionally, the rarity of ictal events introduces an high imbalancing within the data, which could lead to poor prediction performance in supervised learning approaches. Instead, a semi-supervised approach allows to train the model only on data not containing seizures, thus avoiding the issues related to the data imbalancing. This work proposes a semi-supervised approach for detecting epileptic seizures from EEG data, utilizing a novel Deep Learning-based method called SincVAE. This proposal incorporates the learning of an ad-hoc array of bandpass filter as a first layer of a Variational Autoencoder (VAE), potentially eliminating the preprocessing stage where informative band frequencies are identified and isolated. Results indicate that SincVAE improves seizure detection in EEG data and is capable of identifying early seizures during the preictal stage as well as monitoring patients throughout the postictal stage.
Abstract:Modern Artificial Intelligence (AI) systems, especially Deep Learning (DL) models, poses challenges in understanding their inner workings by AI researchers. eXplainable Artificial Intelligence (XAI) inspects internal mechanisms of AI models providing explanations about their decisions. While current XAI research predominantly concentrates on explaining AI systems, there is a growing interest in using XAI techniques to automatically improve the performance of AI systems themselves. This paper proposes a general framework for automatically improving the performance of pre-trained DL classifiers using XAI methods, avoiding the computational overhead associated with retraining complex models from scratch. In particular, we outline the possibility of two different learning strategies for implementing this architecture, which we will call auto-encoder-based and encoder-decoder-based, and discuss their key aspects.
Abstract:Machine Learning (ML) has revolutionized various domains, offering predictive capabilities in several areas. However, with the increasing accessibility of ML tools, many practitioners, lacking deep ML expertise, adopt a "push the button" approach, utilizing user-friendly interfaces without a thorough understanding of underlying algorithms. While this approach provides convenience, it raises concerns about the reliability of outcomes, leading to challenges such as incorrect performance evaluation. This paper addresses a critical issue in ML, known as data leakage, where unintended information contaminates the training data, impacting model performance evaluation. Users, due to a lack of understanding, may inadvertently overlook crucial steps, leading to optimistic performance estimates that may not hold in real-world scenarios. The discrepancy between evaluated and actual performance on new data is a significant concern. In particular, this paper categorizes data leakage in ML, discussing how certain conditions can propagate through the ML workflow. Furthermore, it explores the connection between data leakage and the specific task being addressed, investigates its occurrence in Transfer Learning, and compares standard inductive ML with transductive ML frameworks. The conclusion summarizes key findings, emphasizing the importance of addressing data leakage for robust and reliable ML applications.
Abstract:In the context of classification problems, Deep Learning (DL) approaches represent state of art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks. The basic idea is that each hidden neural layer accomplishes a data transformation which is expected to make the data representation "somewhat more linearly separable" than the previous one to obtain a final data representation which is as linearly separable as possible. However, determining the appropriate neural network parameters that can perform these transformations is a critical problem. In this paper, we investigate the impact on deep network classifier performances of a training approach favouring solutions where data representations at the hidden layers have a higher degree of linear separability between the classes with respect to standard methods. To this aim, we propose a neural network architecture which induces an error function involving the outputs of all the network layers. Although similar approaches have already been partially discussed in the past literature, here we propose a new architecture with a novel error function and an extensive experimental analysis. This experimental analysis was made in the context of image classification tasks considering four widely used datasets. The results show that our approach improves the accuracy on the test set in all the considered cases.
Abstract:Explainable Artificial Intelligence (XAI) aims to provide insights into the decision-making process of AI models, allowing users to understand their results beyond their decisions. A significant goal of XAI is to improve the performance of AI models by providing explanations for their decision-making processes. However, most XAI literature focuses on how to explain an AI system, while less attention has been given to how XAI methods can be exploited to improve an AI system. In this work, a set of well-known XAI methods typically used with Machine Learning (ML) classification tasks are investigated to verify if they can be exploited, not just to provide explanations but also to improve the performance of the model itself. To this aim, two strategies to use the explanation to improve a classification system are reported and empirically evaluated on three datasets: Fashion-MNIST, CIFAR10, and STL10. Results suggest that explanations built by Integrated Gradients highlight input features that can be effectively used to improve classification performance.
Abstract:An interesting case of the well-known Dataset Shift Problem is the classification of Electroencephalogram (EEG) signals in the context of Brain-Computer Interface (BCI). The non-stationarity of EEG signals can lead to poor generalisation performance in BCI classification systems used in different sessions, also from the same subject. In this paper, we start from the hypothesis that the Dataset Shift problem can be alleviated by exploiting suitable eXplainable Artificial Intelligence (XAI) methods to locate and transform the relevant characteristics of the input for the goal of classification. In particular, we focus on an experimental analysis of explanations produced by several XAI methods on an ML system trained on a typical EEG dataset for emotion recognition. Results show that many relevant components found by XAI methods are shared across the sessions and can be used to build a system able to generalise better. However, relevant components of the input signal also appear to be highly dependent on the input itself.
Abstract:In the Machine Learning (ML) literature, a well-known problem is the Dataset Shift problem where, differently from the ML standard hypothesis, the data in the training and test sets can follow different probability distributions, leading ML systems toward poor generalisation performances. This problem is intensely felt in the Brain-Computer Interface (BCI) context, where bio-signals as Electroencephalographic (EEG) are often used. In fact, EEG signals are highly non-stationary both over time and between different subjects. To overcome this problem, several proposed solutions are based on recent transfer learning approaches such as Domain Adaption (DA). In several cases, however, the actual causes of the improvements remain ambiguous. This paper focuses on the impact of data normalisation, or standardisation strategies applied together with DA methods. In particular, using \textit{SEED}, \textit{DEAP}, and \textit{BCI Competition IV 2a} EEG datasets, we experimentally evaluated the impact of different normalization strategies applied with and without several well-known DA methods, comparing the obtained performances. It results that the choice of the normalisation strategy plays a key role on the classifier performances in DA scenarios, and interestingly, in several cases, the use of only an appropriate normalisation schema outperforms the DA technique.
Abstract:Nowadays, it is growing interest to make Machine Learning (ML) systems more understandable and trusting to general users. Thus, generating explanations for ML system behaviours that are understandable to human beings is a central scientific and technological issue addressed by the rapidly growing research area of eXplainable Artificial Intelligence (XAI). Recently, it is becoming more and more evident that new directions to create better explanations should take into account what a good explanation is to a human user, and consequently, develop XAI solutions able to provide user-centred explanations. This paper suggests taking advantage of developing an XAI general approach that allows producing explanations for an ML system behaviour in terms of different and user-selected input features, i.e., explanations composed of input properties that the human user can select according to his background knowledge and goals. To this end, we propose an XAI general approach which is able: 1) to construct explanations in terms of input features that represent more salient and understandable input properties for a user, which we call here Middle-Level input Features (MLFs), 2) to be applied to different types of MLFs. We experimentally tested our approach on two different datasets and using three different types of MLFs. The results seem encouraging.
Abstract:Over the last few years, we have seen increasing data generated from non-Euclidean domains, which are usually represented as graphs with complex relationships, and Graph Neural Networks (GNN) have gained a high interest because of their potential in processing graph-structured data. In particular, there is a strong interest in exploring the possibilities in performing convolution on graphs using an extension of the GNN architecture, generally referred to as Graph Convolutional Neural Networks (GCNN). Convolution on graphs has been achieved mainly in two forms: spectral and spatial convolutions. Due to the higher flexibility in exploring and exploiting the graph structure of data, recently, there is an increasing interest in investigating the possibilities that the spatial approach can offer. The idea of finding a way to adapt the network behaviour to the inputs they process to maximize the total performances has aroused much interest in the neural networks literature over the years. This paper presents a novel method to adapt the behaviour of a GCNN to the input proposing two ways to perform spatial convolution on graphs using input-based filters which are dynamically generated. Our model also investigates the problem of discovering and refining relations among nodes. The experimental assessment confirms the capabilities of the proposed approach, which achieves satisfying results using simple architectures with a low number of filters.
Abstract:This work proposes a novel general framework, in the context of eXplainable Artificial Intelligence (XAI), to construct explanations for the behaviour of Machine Learning (ML) models in terms of middle-level features. One can isolate two different ways to provide explanations in the context of XAI: low and middle-level explanations. Middle-level explanations have been introduced for alleviating some deficiencies of low-level explanations such as, in the context of image classification, the fact that human users are left with a significant interpretive burden: starting from low-level explanations, one has to identify properties of the overall input that are perceptually salient for the human visual system. However, a general approach to correctly evaluate the elements of middle-level explanations with respect ML model responses has never been proposed in the literature.