Abstract: Tiny Machine Learning (TinyML) is a branch of Machine Learning (ML) that constitutes a bridge between the ML world and the embedded-system ecosystem (i.e., Internet of Things devices, embedded devices, and edge computing units), enabling the execution of ML algorithms on devices constrained in terms of memory, computational capabilities, and power consumption. Video Streaming Analysis (VSA), one of the most interesting tasks of TinyML, consists of scanning a sequence of frames in a streaming manner, with the goal of identifying interesting patterns. Given the strict constraints of these tiny devices, all current solutions rely on frame-by-frame analysis, hence not exploiting the temporal component in the stream of data. In this paper, we present StreamTinyNet, the first TinyML architecture to perform multiple-frame VSA, enabling a variety of use cases that require spatio-temporal analysis and were previously impossible to carry out at the TinyML level. Experimental results on publicly available datasets show the effectiveness and efficiency of the proposed solution. Finally, StreamTinyNet has been ported and tested on the Arduino Nicla Vision, showing the feasibility of the proposed approach.
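As a hedged illustration of the multiple-frame idea (not the actual StreamTinyNet architecture, which the abstract does not specify), the PyTorch sketch below shares one tiny per-frame CNN across a window of T frames and classifies the concatenated features, so the temporal component across frames is exploited rather than analyzed frame by frame:

    # Illustrative sketch only: a shared per-frame extractor plus a
    # temporal head over a window of T frames. Sizes are arbitrary.
    import torch
    import torch.nn as nn

    class MultiFrameVSA(nn.Module):
        def __init__(self, num_classes=2, T=4):
            super().__init__()
            self.T = T
            # Tiny per-frame feature extractor, reused for all T frames.
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            # Temporal head: concatenated per-frame features -> class scores.
            self.head = nn.Linear(16 * T, num_classes)

        def forward(self, x):                       # x: (batch, T, 3, H, W)
            b, t = x.shape[:2]
            feats = self.backbone(x.flatten(0, 1))  # (batch*T, 16)
            return self.head(feats.reshape(b, t * 16))

    model = MultiFrameVSA()
    scores = model(torch.randn(1, 4, 3, 64, 64))    # one window of 4 frames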
Abstract: Quantization has become increasingly pivotal in addressing the steadily increasing computational and memory requirements of Deep Neural Networks (DNNs). By reducing the number of bits used to represent weights and activations (typically from 32-bit floating point to 16-bit or 8-bit integers), quantization reduces the memory footprint, energy consumption, and execution time of DNN models. However, traditional quantization methods typically focus on the inference of DNNs, while the training process still relies on floating-point operations. To date, only one work in the literature has addressed integer-only training, and only for Multi-Layer Perceptron (MLP) architectures. This work introduces NITRO-D, a new framework for training arbitrarily deep integer-only Convolutional Neural Networks (CNNs) that operate entirely in the integer-only domain for both training and inference. NITRO-D is the first framework in the literature enabling the training of integer-only CNNs without the need to introduce a quantization scheme. Specifically, NITRO-D introduces a novel architecture integrating multiple integer local-loss blocks, which include the proposed NITRO Scaling Layer and the NITRO-ReLU activation function. Additionally, it introduces a novel integer-only learning algorithm derived from Local Error Signals (LES), utilizing IntegerSGD, an optimizer specifically designed to operate in an integer-only context. NITRO-D is implemented in an open-source Python library. Extensive experimental evaluations demonstrate its effectiveness across several state-of-the-art image recognition datasets. Results show significant performance improvements, from 2.47% to 5.96%, for integer-only MLP architectures over the state-of-the-art solution, and the capability of training integer-only CNN architectures with minimal accuracy degradation, from -0.15% to -4.22%, compared to floating-point LES.
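The abstract does not define the NITRO layers, so the following toy numpy sketch only illustrates the general flavour of an integer-only forward pass: exact integer matrix products, a hypothetical scaling layer that divides activations back into range, and an integer ReLU. The real NITRO Scaling Layer and NITRO-ReLU will differ:

    # Toy sketch of an integer-only forward pass (not NITRO-D's definitions).
    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.integers(-8, 8, size=(64, 32), dtype=np.int32)     # integer weights
    x = rng.integers(-128, 128, size=(1, 64), dtype=np.int32)  # integer input

    z = x.astype(np.int64) @ W.astype(np.int64)  # exact integer matmul
    z = z // 64           # hypothetical scaling layer: shift back into range
    a = np.maximum(z, 0)  # integer ReLU: stays in the integer domain
    print(a.dtype, a.min(), a.max())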
Abstract: TinyML is a novel area of machine learning that has gained huge momentum in the last few years thanks to the ability to execute machine learning algorithms on tiny devices (such as Internet-of-Things or embedded systems). Interestingly, research in this area has focused on the efficient execution of the inference phase of TinyML models on tiny devices, while very few solutions for on-device learning of TinyML models are available in the literature due to the relevant overhead introduced by the learning algorithms. The aim of this paper is to introduce a new type of adaptive TinyML solution that can be used in tasks, such as the presented Tiny Speaker Verification (TinySV), that must be tackled with an on-device learning algorithm. Achieving this goal required (i) reducing the memory and computational demand of TinyML learning algorithms, and (ii) designing a TinyML learning algorithm operating with few, and possibly unlabelled, training data. The proposed TinySV solution relies on a two-layer hierarchical TinyML architecture comprising a Keyword Spotting module and an Adaptive Speaker Verification module. We evaluated the effectiveness and efficiency of the proposed TinySV solution on a dataset collected expressly for the task, and tested it on a real-world IoT device (Infineon PSoC 62S2 Wi-Fi BT Pioneer Kit).
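A minimal sketch of the two-layer hierarchy follows; all components (embedding size, cosine scoring, thresholds, the toy keyword gate) are illustrative assumptions, not the paper's actual design:

    # Illustrative two-layer pipeline: keyword spotting gates the input,
    # then speaker verification compares against enrolled embeddings.
    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    def verify(embedding, enrolled, threshold=0.7):
        # Accept if the utterance is close enough to any enrolled embedding.
        return max(cosine(embedding, e) for e in enrolled) >= threshold

    def tiny_sv(frame, kws_model, sv_embedder, enrolled):
        if not kws_model(frame):       # layer 1: keyword spotting gate
            return None                # keyword absent: skip verification
        emb = sv_embedder(frame)       # layer 2: speaker embedding
        return verify(emb, enrolled)

    rng = np.random.default_rng(0)
    enrolled = [rng.standard_normal(16) for _ in range(3)]
    print(tiny_sv(rng.standard_normal(16),
                  kws_model=lambda f: True,
                  sv_embedder=lambda f: enrolled[0]
                  + 0.05 * rng.standard_normal(16),
                  enrolled=enrolled))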
Abstract: Accurate classification of medical images is essential for modern diagnostics. Advances in deep learning have led clinicians to increasingly rely on sophisticated models to make faster and more accurate decisions, sometimes replacing human judgment. However, model development is costly and repetitive. Neural Architecture Search (NAS) provides a solution by automating the design of deep learning architectures. This paper presents ZO-DARTS+, a differentiable NAS algorithm that improves search efficiency through a novel method for generating sparse probabilities via bi-level optimization. Experiments on five public medical datasets show that ZO-DARTS+ matches the accuracy of state-of-the-art solutions while reducing search time by up to a factor of three.
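DARTS-style NAS weights candidate operations with a probability vector derived from architecture parameters; one well-known way to make such probabilities sparse is the sparsemax projection sketched below in numpy. The exact sparsity mechanism of ZO-DARTS+ may differ; this is only an illustration:

    # Sparsemax (Martins & Astudillo, 2016): Euclidean projection onto the
    # probability simplex; many entries become exactly zero.
    import numpy as np

    def sparsemax(v):
        z = np.sort(v)[::-1]                     # sort descending
        k = np.arange(1, len(v) + 1)
        cssv = np.cumsum(z)
        support = z + (1 - cssv) / k > 0         # entries kept in support
        k_star = k[support][-1]
        tau = (cssv[k_star - 1] - 1) / k_star
        return np.maximum(v - tau, 0.0)

    alpha = np.array([1.2, 0.3, 0.9, -0.5])      # architecture parameters
    print(sparsemax(alpha))                      # sparse mixture weights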
Abstract: Neural Architecture Search (NAS) paves the way for the automatic definition of Neural Network (NN) architectures, attracting increasing research attention and offering solutions in various scenarios. This study introduces a novel NAS solution, called Flat Neural Architecture Search (FlatNAS), which explores the interplay between a novel figure of merit based on robustness to weight perturbations and single-NN optimization with Sharpness-Aware Minimization (SAM). FlatNAS is the first work in the literature to systematically explore flat regions in the loss landscape of NNs within a NAS procedure, while jointly optimizing their performance on in-distribution data, their out-of-distribution (OOD) robustness, and constraining the number of parameters in their architecture. Unlike current studies that primarily concentrate on OOD algorithms, FlatNAS successfully evaluates the impact of NN architectures on OOD robustness, a crucial aspect in real-world applications of machine and deep learning. FlatNAS achieves a good trade-off between performance, OOD generalization, and the number of parameters, using only in-distribution data in the NAS exploration. The OOD robustness of the NAS-designed models is evaluated by focusing on robustness to input data corruptions, using popular benchmark datasets from the literature.
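To make the SAM component concrete, here is a minimal numpy sketch of the Sharpness-Aware Minimization update on a toy one-dimensional objective (FlatNAS itself optimizes full NNs; this only shows the two-step update that seeks flat minima):

    # One SAM step: ascend to the worst-case point inside an L2 ball of
    # radius rho, then descend using the gradient taken at that point.
    import numpy as np

    def loss_grad(w):
        return 2 * w * (1 + np.sin(5 * w))   # gradient of a toy bumpy loss

    w, rho, lr = np.array([1.0]), 0.05, 0.1
    for _ in range(100):
        g = loss_grad(w)
        eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation
        w = w - lr * loss_grad(w + eps)              # descend from there
    print(w)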
Abstract: Early Exit Neural Networks (EENNs) endow a standard Deep Neural Network (DNN) with Early Exit Classifiers (EECs) that provide predictions at intermediate points of the processing when enough confidence in the classification is achieved, leading to many benefits in terms of effectiveness and efficiency. Currently, the design of EENNs is carried out manually by experts, a complex and time-consuming task that requires accounting for many aspects, including the correct placement, the thresholding, and the computational overhead of the EECs. For this reason, research is exploring the use of Neural Architecture Search (NAS) to automate the design of EENNs. Currently, few comprehensive NAS solutions for EENNs have been proposed in the literature, and a fully automated, joint design strategy taking into consideration both the backbone and the EECs remains an open problem. To this end, this work presents Neural Architecture Search for Hardware Constrained Early Exit Neural Networks (NACHOS), the first NAS framework for the design of optimal EENNs satisfying constraints on the accuracy and on the number of Multiply-and-Accumulate (MAC) operations performed by the EENNs at inference time. In particular, NACHOS jointly designs the backbone and the EECs to select a set of admissible (i.e., constraint-respecting) Pareto-optimal solutions in terms of the best trade-off between accuracy and number of MACs. The results show that the models designed by NACHOS are competitive with state-of-the-art EENNs. Additionally, this work investigates the effectiveness of two novel regularization terms designed for the optimization of the auxiliary classifiers of the EENN.
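As a hedged sketch of the inference-time behaviour that such designs target (toy stand-ins, not a NACHOS-designed model), each backbone stage feeds an early-exit classifier and the forward pass stops as soon as the softmax confidence clears that exit's threshold:

    # Toy early-exit inference loop; stages and exits are random stand-ins.
    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def eenn_predict(x, stages, exits, thresholds):
        h = x
        for stage, exit_clf, thr in zip(stages, exits, thresholds):
            h = stage(h)                  # next chunk of the backbone
            p = softmax(exit_clf(h))
            if p.max() >= thr:            # confident enough: stop here
                return int(p.argmax())
        return int(p.argmax())            # fall through to the final exit

    rng = np.random.default_rng(0)
    stages = [(lambda h, W=rng.standard_normal((8, 8)): np.tanh(W @ h))
              for _ in range(3)]
    exits = [(lambda h, W=rng.standard_normal((4, 8)): W @ h)
             for _ in range(3)]
    print(eenn_predict(rng.standard_normal(8), stages, exits, [0.9, 0.8, 0.0]))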
Abstract: In real-world applications, the process generating the data might suffer from nonstationary effects (e.g., due to seasonality, faults affecting sensors or actuators, or changes in the users' behaviour). These changes, often called concept drift, might have severe (potentially catastrophic) impacts on trained learning models, which become obsolete over time and inadequate for the task at hand. Learning in the presence of concept drift aims at designing machine and deep learning models that are able to track and adapt to concept drift. Typically, techniques to handle concept drift are either active or passive, and traditionally these have been considered mutually exclusive. Active techniques use an explicit drift-detection mechanism and re-train the learning algorithm when concept drift is detected. Passive techniques use an implicit method to deal with drift and continually update the model using incremental learning. Differently from what is present in the literature, we propose a hybrid alternative that merges the two approaches, hence leveraging their advantages. The proposed method, called Hybrid-Adaptive REBAlancing (HAREBA), significantly outperforms strong baselines and state-of-the-art methods in terms of learning quality and speed; we also show experimentally that it remains effective under severe levels of class imbalance.
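The hybrid active/passive idea can be sketched in a few lines of Python; the model, the detector, and the reaction to a detection below are deliberately toy components (HAREBA additionally handles class rebalancing, which is not shown):

    # Toy hybrid loop: incremental updates on every sample (passive) plus an
    # explicit drift detector that triggers a reset when it fires (active).
    import numpy as np

    class MajorityModel:
        """Toy incremental model: predicts the majority label seen so far."""
        def __init__(self):
            self.counts = {}
        def predict(self, x):
            return max(self.counts, key=self.counts.get) if self.counts else 0
        def partial_fit(self, x, y):
            self.counts[y] = self.counts.get(y, 0) + 1
        def reset(self):
            self.counts = {}

    class ErrorRateDetector:
        """Toy detector: flags drift when the recent error rate exceeds
        the long-run error rate by a tolerance."""
        def __init__(self, window=50, tol=0.15):
            self.errors, self.window, self.tol = [], window, tol
        def update(self, err):
            self.errors.append(err)
            if len(self.errors) <= self.window:
                return False
            recent = np.mean(self.errors[-self.window:])
            return recent - np.mean(self.errors) > self.tol

    def hybrid_stream(stream, model, detector):
        for x, y in stream:
            err = int(model.predict(x) != y)
            if detector.update(err):      # active: explicit drift detection
                model.reset()             # discard the obsolete model
                detector = ErrorRateDetector()
            model.partial_fit(x, y)       # passive: always keep adapting
        return model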
Abstract: Tiny Machine Learning (TML) is a new research area whose goal is to design machine and deep learning techniques able to operate in Embedded Systems and IoT units, hence satisfying the severe technological constraints on memory, computation, and energy characterizing these pervasive devices. Interestingly, the related literature mainly focused on reducing the computational and memory demand of the inference phase of machine and deep learning models, while training is typically assumed to be carried out in Cloud or edge computing systems (due to the larger memory and computational requirements). This assumption results in TML solutions that might become obsolete when the process generating the data is affected by concept drift (e.g., due to periodicity or seasonality effects, faults or malfunctioning affecting sensors or actuators, or changes in the users' behavior), a common situation in real-world application scenarios. For the first time in the literature, this paper introduces a Tiny Machine Learning for Concept Drift (TML-CD) solution based on deep learning feature extractors and a k-nearest neighbors classifier integrating a hybrid adaptation module able to deal with concept drift affecting the data-generating process. This adaptation module continuously updates (in a passive way) the knowledge base of TML-CD and, at the same time, employs a Change Detection Test to inspect for changes (in an active way), quickly adapting to concept drift by removing obsolete knowledge. Experimental results on both image and audio benchmarks show the effectiveness of the proposed solution, while the porting of TML-CD onto three off-the-shelf micro-controller units shows the feasibility of the proposed solution in real-world pervasive systems.
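A hedged sketch of the adaptation module's two modes around the kNN classifier follows (toy code; the deep feature extractor and the actual Change Detection Test are omitted):

    # Toy kNN knowledge base: grown passively with new samples, pruned
    # actively when a change-detection test fires.
    import numpy as np

    class KNNKnowledgeBase:
        """kNN classifier over features from a (frozen) extractor."""
        def __init__(self, k=3):
            self.k, self.X, self.y = k, [], []
        def add(self, feat, label):        # passive: grow the knowledge base
            self.X.append(feat)
            self.y.append(label)
        def forget_before(self, t):        # active: drop pre-change samples
            self.X, self.y = self.X[t:], self.y[t:]
        def predict(self, feat):
            d = np.linalg.norm(np.asarray(self.X) - feat, axis=1)
            nearest = np.asarray(self.y)[np.argsort(d)[: self.k]]
            return int(np.bincount(nearest).argmax())

    kb = KNNKnowledgeBase(k=1)
    kb.add(np.array([0.0, 0.0]), 0)
    kb.add(np.array([1.0, 1.0]), 1)
    print(kb.predict(np.array([0.9, 0.8])))   # -> 1
    # When a Change Detection Test fires at time t, forget_before(t)
    # removes obsolete knowledge while new samples keep arriving passively.
    kb.forget_before(1)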
Abstract: Extracting patterns and useful information from Natural Language datasets is a challenging task, especially when dealing with data written in a language other than English, such as Italian. Machine and Deep Learning, together with Natural Language Processing (NLP) techniques, have spread widely and improved rapidly in recent years, providing a plethora of useful methods to address both supervised and unsupervised problems on textual information. We propose RECKONition, an NLP-based system for the prevention of industrial accidents at work. RECKONition, which is meant to provide Natural Language Understanding, Clustering, and Inference, is the result of a joint partnership with the Italian National Institute for Insurance against Accidents at Work (INAIL). The obtained results showed the ability to process textual data written in Italian describing the dynamics and consequences of industrial accidents.
Abstract: In most transfer learning approaches to reinforcement learning (RL), the distribution over the tasks is assumed to be stationary, so that the target and source tasks are i.i.d. samples from the same distribution. In this work, we consider the problem of transferring value functions through a variational method when the distribution that generates the tasks is time-variant, proposing a solution that leverages the temporal structure inherent in the task-generating process. Furthermore, by means of a finite-sample analysis, the proposed solution is theoretically compared to its time-invariant version. Finally, we provide an experimental evaluation of the proposed technique with three distinct temporal dynamics in three different RL environments.
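As a purely illustrative sketch of why the temporal structure helps (this is not the paper's variational method), a prior for the target value function can weight the source-task value functions by recency, so that tasks drawn long ago, under a since-drifted distribution, count less:

    # Recency-weighted prior over tabular source value functions; the decay
    # rate and the tabular setting are illustrative assumptions.
    import numpy as np

    def recency_weighted_prior(source_values, ages, decay=0.5):
        # source_values: (n_tasks, n_states) array of value functions
        # ages: how many steps ago each source task was generated
        w = np.exp(-decay * np.asarray(ages, dtype=float))
        w /= w.sum()
        return w @ source_values   # prior mean for the target value function

    V = np.array([[1.0, 0.0], [0.8, 0.2], [0.2, 0.9]])  # three source tasks
    print(recency_weighted_prior(V, ages=[2, 1, 0]))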