Abstract: Wearable systems for the long-term monitoring of cardiovascular diseases are becoming widespread and valuable assets in diagnosis and therapy. A promising approach for real-time analysis of the electrocardiographic (ECG) signal and the detection of heart conditions, such as arrhythmia, is the transformer machine learning model. Transformers are powerful models for time-series classification, but their efficient implementation in the wearable domain raises significant design challenges in combining adequate accuracy with suitable complexity. In this work, we present a tiny transformer model for the analysis of the ECG signal, requiring only 6k parameters and reaching 98.97% accuracy in the recognition of the 5 most common arrhythmia classes from the MIT-BIH Arrhythmia database, assessed with 8-bit integer inference as required for efficient execution on low-power microcontroller-based devices. We explored an augmentation-based training approach to improve robustness against electrode motion artifact noise, resulting in a worst-case post-deployment accuracy of 98.36%. Suitability for wearable monitoring solutions is finally demonstrated through efficient deployment on the parallel ultra-low-power GAP9 processor, where inference execution requires 4.28 ms and 0.09 mJ.
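As a rough illustration of the kind of model this abstract describes, the sketch below builds a tiny transformer encoder for single-beat ECG classification in PyTorch. All layer sizes, the 128-sample beat length, and the parameter-budget check are assumptions for illustration only, not the paper's exact architecture or its quantized GAP9 deployment flow.

```python
# Hedged sketch (not the authors' exact architecture): a tiny transformer
# encoder for single-beat ECG classification, sized so that the parameter
# count stays in the few-thousand range. All dimensions are illustrative.
import torch
import torch.nn as nn

class TinyECGTransformer(nn.Module):
    def __init__(self, beat_len=128, d_model=16, n_heads=2, n_layers=2, n_classes=5):
        super().__init__()
        # Project each scalar ECG sample to a small embedding, add a learned position.
        self.embed = nn.Linear(1, d_model)
        self.pos = nn.Parameter(torch.zeros(1, beat_len, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=2 * d_model,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)  # 5 arrhythmia classes

    def forward(self, x):                    # x: (batch, beat_len) raw samples
        h = self.embed(x.unsqueeze(-1)) + self.pos
        h = self.encoder(h)
        return self.head(h.mean(dim=1))      # average-pool tokens, then classify

model = TinyECGTransformer()
print(sum(p.numel() for p in model.parameters()))  # rough check: ~6-7k parameters
```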
Abstract: The new generation of wireless technologies, fitness trackers, and devices with embedded sensors can have a significant impact on healthcare systems and quality of life. Among the most crucial aspects to consider in these devices are the accuracy of the data produced and power consumption. Many of the events that can be monitored, while apparently simple, may not be easily detectable and recognizable by devices equipped with embedded sensors, especially devices with low computing capabilities. It is well known that deep learning reduces the need to hand-engineer the features that contribute to the recognition of the different target classes. In this work, we present a portable, battery-powered, microcontroller-based device applicable to a wobble board. Wobble boards are low-cost equipment that can be used for sensorimotor training to prevent ankle injuries or as part of the rehabilitation process after an injury. The exercise recognition process was implemented using cognitive techniques based on deep learning. To reduce power consumption, we add an adaptivity layer that dynamically manages the device's hardware and software configuration to adapt it to the required operating mode at runtime. Our experimental results show that adjusting the node configuration to the workload at runtime can save up to 60% of the power consumed. On a custom dataset, our optimized and quantized neural network achieves an accuracy greater than 97% in detecting specific physical exercises on a wobble board.
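The sketch below illustrates the runtime-adaptivity idea in a simplified form: the node selects a hardware/software configuration matching its current operating mode instead of always running at the worst-case setting, and reconfigures only when the mode changes. The mode names, thresholds, and NodeConfig fields are illustrative assumptions, not the paper's firmware interface.

```python
# Hedged sketch of the adaptivity layer described above (names and thresholds
# are illustrative): pick a device configuration from cheap, locally computed
# indicators of the current workload.
from dataclasses import dataclass

@dataclass
class NodeConfig:
    cpu_mhz: int      # core clock while processing
    sensor_hz: int    # IMU sampling rate
    radio_on: bool    # keep the radio link active or buffer locally

CONFIGS = {
    "idle":      NodeConfig(cpu_mhz=16, sensor_hz=10,  radio_on=False),
    "exercise":  NodeConfig(cpu_mhz=80, sensor_hz=100, radio_on=False),
    "streaming": NodeConfig(cpu_mhz=80, sensor_hz=100, radio_on=True),
}

def select_mode(motion_energy: float, link_requested: bool) -> str:
    """Choose the operating mode from the observed activity level."""
    if motion_energy < 0.05:          # board essentially still
        return "idle"
    return "streaming" if link_requested else "exercise"

# Example control step: reconfigure only on a mode change, so the
# reconfiguration overhead is not paid on every sample window.
current = "idle"
new_mode = select_mode(motion_energy=0.3, link_requested=False)
if new_mode != current:
    print(f"switching to {new_mode}: {CONFIGS[new_mode]}")
    current = new_mode
```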
Abstract: The Internet of Medical Things (IoMT) paradigm is becoming mainstream in multiple clinical trials and healthcare procedures. It relies on novel, very accurate and compact sensing devices and communication infrastructures, opening previously unmatched possibilities for data collection and continuous patient monitoring. Nevertheless, to fully exploit the potential of this technology, some steps forward are needed. First, the edge-computing paradigm must be added to the picture: a certain level of near-sensor processing has to be enabled to improve the scalability, portability, reliability, and responsiveness of the IoMT nodes. Second, novel, increasingly accurate data analysis algorithms, such as those based on artificial intelligence and deep learning, must be exploited. To reach these objectives, designers and programmers of IoMT nodes have to face challenging optimization tasks in order to execute fairly complex computing workloads on low-power wearable and portable processing systems with tight power and battery lifetime budgets. In this work, we explore the implementation of cognitive data analysis algorithms on resource-constrained computing platforms. To minimize power consumption, we add an adaptivity layer that dynamically manages the hardware and software configuration of the device to adapt it at runtime to the required operating mode. We have assessed our approach on a use case employing a convolutional neural network to classify electrocardiogram (ECG) traces on a low-power microcontroller. Our experimental results show that adapting the node setup to the workload at runtime can save up to 50% of power consumption, and that a quantized neural network reaches an accuracy higher than 98% for arrhythmia detection on the MIT-BIH Arrhythmia dataset.
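As a hedged illustration of this use case, the sketch below defines a compact 1D CNN for ECG beat classification and applies PyTorch post-training dynamic quantization as a stand-in for the 8-bit integer deployment flow mentioned in the abstract. The layer sizes and beat length are assumptions, not the paper's exact network or toolchain.

```python
# Hedged sketch (layer sizes are illustrative): a compact 1D CNN for
# single-beat ECG classification, of the kind that can be quantized to
# 8-bit integers and run on a low-power microcontroller.
import torch
import torch.nn as nn

class SmallECGCNN(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(8, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(16, n_classes)

    def forward(self, x):                   # x: (batch, 1, beat_len)
        h = self.features(x).mean(dim=-1)   # global average pooling over time
        return self.classifier(h)

# Post-training dynamic quantization of the dense layer, as a simple stand-in
# for a full integer-only deployment flow.
model = SmallECGCNN().eval()
quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized(torch.randn(1, 1, 128)).shape)   # (1, 5)
```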
Abstract: This volume contains the papers accepted at the first DATE Friday Workshop on System-level Design Methods for Deep Learning on Heterogeneous Architectures (SLOHA 2021), held virtually on February 5, 2021. SLOHA 2021 was co-located with the Conference on Design, Automation and Test in Europe (DATE).
Abstract: Deep convolutional neural networks (CNNs) obtain outstanding results in tasks that require human-level understanding of data, such as image or speech recognition. However, their computational load is significant, motivating the development of CNN-specialized accelerators. This work presents NEURAghe, a flexible and efficient hardware/software solution for the acceleration of CNNs on Zynq SoCs. NEURAghe leverages the synergistic usage of the Zynq ARM cores and of a powerful and flexible Convolution-Specific Processor deployed on the reconfigurable logic. The Convolution-Specific Processor embeds both a convolution engine and a programmable soft core, releasing the ARM processors from most of the supervision duties and allowing the accelerator to be controlled by software at an ultra-fine granularity. This methodology opens the way for cooperative heterogeneous computing: while the accelerator takes care of the bulk of the CNN workload, the ARM cores can seamlessly execute hard-to-accelerate parts of the computational graph, taking advantage of the NEON vector engines to further speed up computation. Through the companion NeuDNN software stack, NEURAghe supports end-to-end CNN-based classification with a peak performance of 169 Gops/s and an energy efficiency of 17 Gops/W. Thanks to our heterogeneous computing model, our platform improves upon the state of the art, achieving a frame rate of 5.5 fps on the end-to-end execution of VGG-16 and 6.6 fps on ResNet-18.
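The toy sketch below conveys the cooperative heterogeneous-computing pattern in a deliberately simplified form: convolution layers are dispatched to the accelerator, while the remaining layers run on the host cores. All function names and the layer representation are illustrative assumptions, not NEURAghe's NeuDNN API.

```python
# Hedged sketch of cooperative execution (all names are illustrative):
# convolutions go to the FPGA-side Convolution-Specific Processor, while
# layers that map poorly onto it run on the ARM cores, so host and
# accelerator process the network graph together.
from typing import Dict, List

def offload_to_csp(layer: Dict, x: list) -> list:
    # Stand-in for a call into the accelerator driver.
    return [layer["scale"] * v for v in x]

def run_on_arm(layer: Dict, x: list) -> list:
    # Stand-in for a NEON-optimized software kernel (pooling, softmax, ...).
    return [max(v, 0.0) for v in x]

def run_network(layers: List[Dict], x: list) -> list:
    for layer in layers:
        if layer["type"] == "conv":
            x = offload_to_csp(layer, x)   # bulk of the compute on the PL side
        else:
            x = run_on_arm(layer, x)       # hard-to-accelerate parts on the PS side
    return x

print(run_network([{"type": "conv", "scale": 2.0}, {"type": "relu"}], [1.0, -3.0]))
```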