Abstract:This paper introduces TinySaver, an early-exit-like dynamic model compression approach that adaptively substitutes tiny models for large ones. Unlike traditional compression techniques, dynamic methods such as TinySaver exploit differences in input difficulty to let certain inputs complete inference early, thereby conserving computational resources. Most existing early-exit designs attach additional network branches to the model's backbone. Our study, however, reveals that completely independent tiny models can take over a substantial portion of a larger model's workload with minimal impact on performance. Employing them as the first exit can markedly improve computational efficiency. By searching for and employing the most appropriate tiny model as the computational saver for a given large model, the proposed approach serves as a novel and generic method for model compression. This finding will help the research community explore new compression methods to address the escalating computational demands of rapidly evolving AI models. Our evaluation on ImageNet-1k classification demonstrates that the approach can reduce the number of compute operations by up to 90%, with only negligible losses in performance, across various modern vision models. The code of this work will be available.
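The idea of using an independent tiny model as the first exit can be illustrated with a small sketch. It assumes a confidence-threshold routing rule (the abstract does not state the exit criterion), and the names `cascade_predict`, `expected_cost`, `toy_tiny`, and `toy_large` are hypothetical stand-ins, not the authors' code.

```python
from typing import Callable, Sequence, Tuple

# A "model" here is any callable mapping an input to class probabilities.
Model = Callable[[float], Sequence[float]]


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def cascade_predict(x: float, tiny: Model, large: Model,
                    threshold: float = 0.9) -> Tuple[int, str]:
    """Run the tiny model first; fall back to the large model only when
    the tiny model's top probability is below `threshold` (assumed rule)."""
    probs = tiny(x)
    if max(probs) >= threshold:
        return argmax(probs), "tiny"
    return argmax(large(x)), "large"


def expected_cost(tiny_cost: float, large_cost: float, exit_rate: float) -> float:
    """Average cost per input: the tiny model always runs, the large model
    runs only for the fraction of inputs that do not exit early."""
    return tiny_cost + (1.0 - exit_rate) * large_cost


# Toy stand-ins for real networks (illustrative only).
def toy_tiny(x: float) -> Sequence[float]:
    return [0.95, 0.05] if x < 0 else [0.55, 0.45]


def toy_large(x: float) -> Sequence[float]:
    return [0.9, 0.1] if x < 0 else [0.1, 0.9]


print(cascade_predict(-1.0, toy_tiny, toy_large))
print(cascade_predict(1.0, toy_tiny, toy_large))
```

Under this assumed rule, the saving depends on how often the tiny model is confident and on how cheap it is relative to the large model.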
Abstract:Modern deep learning (DL) models require scaling and compression techniques for effective deployment in resource-constrained environments. Most existing techniques, such as pruning and quantization, are static. Dynamic compression methods, such as early exits, instead reduce complexity by recognizing the difficulty of input samples and allocating computation as needed. Despite their superior flexibility and potential to coexist with static methods, dynamic methods pose significant implementation challenges because any change in the dynamic parts influences subsequent processes. Moreover, most current dynamic compression designs are monolithic and tightly integrated with base models, which complicates adaptation to novel base models. This paper introduces DyCE, a dynamic configurable early-exit framework that decouples design considerations from each other and from the base model. With this framework, various types and positions of exits can be organized according to predefined configurations, which can be switched in real time to accommodate evolving performance-complexity requirements. We also propose techniques for generating optimized configurations based on any desired trade-off between performance and computational complexity. This empowers future researchers to focus on improving individual exits without latent compromise of overall system performance. The efficacy of this approach is demonstrated through image classification tasks with deep CNNs. DyCE reduces the computational complexity of ResNet152 by 23.5% and of ConvNextv2-tiny by 25.9% on ImageNet, with accuracy reductions of less than 0.5%. Furthermore, DyCE offers advantages over existing dynamic methods in terms of real-time configuration and fine-grained performance tuning.
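A configurable early-exit pipeline of the kind described above might look like the following sketch. It assumes that a configuration is a mapping from exit position to confidence threshold; that representation, the function `run_with_exits`, and the toy stages and heads are all hypothetical and are not taken from DyCE itself.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Stage = Callable[[float], float]           # backbone block: feature -> feature
Head = Callable[[float], Sequence[float]]  # exit head: feature -> probabilities


def argmax(probs: Sequence[float]) -> int:
    return max(range(len(probs)), key=lambda i: probs[i])


def run_with_exits(x: float, stages: List[Stage], exits: Dict[int, Head],
                   config: Dict[int, float], final_head: Head) -> Tuple[int, int]:
    """Run the backbone stage by stage. After stage i, if the active
    configuration enables exit i and its confidence reaches the configured
    threshold, stop early. Returns (label, exit position); position
    len(stages) means the final head was used."""
    h = x
    for i, stage in enumerate(stages):
        h = stage(h)
        if i in config and i in exits:
            probs = exits[i](h)
            if max(probs) >= config[i]:
                return argmax(probs), i
    return argmax(final_head(h)), len(stages)


# Toy backbone with one attached exit (illustrative only).
stages = [lambda h: h + 1.0, lambda h: h * 2.0]
exits = {0: lambda h: [0.8, 0.2]}
final_head = lambda h: [0.3, 0.7]

# Two configurations that can be swapped at run time without touching the model.
config_fast = {0: 0.7}       # permissive threshold: more inputs exit early
config_accurate = {0: 0.95}  # strict threshold: most inputs reach the final head

print(run_with_exits(0.0, stages, exits, config_fast, final_head))
print(run_with_exits(0.0, stages, exits, config_accurate, final_head))
```

Because the configuration is plain data, switching the performance-complexity trade-off in this sketch only means passing a different dictionary.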
Abstract:Sudden cardiac death and arrhythmia account for a large percentage of all deaths worldwide. Electrocardiography (ECG) is the most widely used screening tool for cardiovascular diseases. Traditionally, ECG signals are classified manually, which requires experience and great skill and is time-consuming and prone to error. Machine learning algorithms have therefore been widely adopted because of their ability to perform complex data analysis. Features derived from the points of interest in the ECG, mainly Q, R, and S, are widely used for arrhythmia detection. In this work, we demonstrate improved performance for ECG classification using hybrid features and three different models, building on a 1-D convolutional neural network (CNN) model that we proposed previously. A model based on RR interval features proposed in this work achieved an accuracy of 98.98%, which is an improvement over the baseline model. To make the model robust to noise, we updated it using frequency features and achieved good sustained performance in the presence of noise, with a slightly lower accuracy of 98.69%. Further, another model combining the frequency features and the RR interval features was developed, which achieved a high accuracy of 99% with good sustained performance in noisy environments. Due to its high accuracy and noise immunity, the proposed model, which combines multiple hybrid features, is well suited for ambulatory wearable sensing applications.
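RR interval features of the kind mentioned above are typically computed from the sample indices of detected R peaks. The sketch below assumes three common per-beat features (previous RR, next RR, and previous RR normalised by the record mean); the abstract does not list the exact features used, so this choice, the function name `rr_features`, and the 360 Hz sampling rate are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def rr_features(r_peaks: Sequence[int], fs: float) -> List[Tuple[float, float, float]]:
    """For each beat that has both a predecessor and a successor, return
    (pre_rr, post_rr, pre_rr / mean_rr), with intervals in seconds.

    r_peaks: sample indices of R peaks, in increasing order.
    fs: sampling frequency in Hz.
    """
    # Interval between consecutive R peaks, converted from samples to seconds.
    rr = [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]
    if not rr:
        return []
    mean_rr = sum(rr) / len(rr)
    feats = []
    for i in range(1, len(r_peaks) - 1):
        pre_rr = rr[i - 1]   # interval ending at beat i
        post_rr = rr[i]      # interval starting at beat i
        feats.append((pre_rr, post_rr, pre_rr / mean_rr))
    return feats


# Illustrative peak positions; the fourth beat arrives early (a short pre-RR
# followed by a long post-RR), the pattern such features are meant to expose.
peaks = [0, 360, 720, 900, 1440]
for row in rr_features(peaks, fs=360.0):
    print(row)
```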
Abstract:Using smart wearable devices to monitor patients' electrocardiograms (ECG) for real-time detection of arrhythmias can significantly improve healthcare outcomes. Convolutional neural network (CNN) based deep learning has been used successfully to detect anomalous beats in ECG. However, the computational complexity of existing CNN models prohibits them from being implemented on low-power edge devices. Such models typically have many parameters, which results in a large number of computations and high memory and power usage on edge devices. Network pruning techniques can reduce model complexity at the expense of performance in CNN models. This paper presents a novel multistage pruning technique that reduces CNN model complexity with negligible loss in performance compared to existing pruning techniques. An existing CNN model for ECG classification is used as a baseline reference. At 60% sparsity, the proposed technique achieves 97.7% accuracy and an F1 score of 93.59% for ECG classification tasks. This is an improvement of 3.3% and 9% in accuracy and F1 score, respectively, compared to the traditional pruning with fine-tuning approach. Compared to the baseline model, we also achieve a 60.4% decrease in run-time complexity.
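To make "60% sparsity" concrete, the sketch below applies plain magnitude pruning in several stages of increasing sparsity. This is a generic illustration, not the multistage technique proposed in the paper (whose details the abstract does not give); the schedule `[0.2, 0.4, 0.6]`, the function names, and the weight values are assumptions.

```python
from typing import List, Sequence


def magnitude_prune(weights: Sequence[float], sparsity: float) -> List[float]:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = round(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def sparsity_of(weights: Sequence[float]) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


def staged_prune(weights: Sequence[float], schedule: Sequence[float]) -> List[float]:
    """Prune in stages of increasing sparsity. In a real pipeline the model
    would be fine-tuned between stages; that step is omitted here."""
    current = list(weights)
    for target in schedule:
        current = magnitude_prune(current, target)
        # (fine-tuning of the remaining weights would happen here)
    return current


w = [0.5, -0.1, 0.3, -0.8, 0.05, 0.9, -0.2, 0.4, -0.6, 0.7]
final = staged_prune(w, [0.2, 0.4, 0.6])
print(final, sparsity_of(final))
```

At the final stage only the four largest-magnitude weights out of ten survive, which is what a sparsity of 0.6 means here.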
Abstract:Sleep apnea-hypopnea syndrome, an abnormal pause or rate reduction in breathing, affects an individual's quality of sleep. This paper presents a novel method for detecting sleep apnea events (pauses in breathing) from peripheral oxygen saturation (SpO2) signals obtained from wearable devices. It details a high-resolution apnea detection algorithm that operates on a per-second basis, for which a 1-dimensional convolutional neural network, termed SomnNET, is developed. This network exhibits an accuracy of 97.08% and outperforms several lower-resolution state-of-the-art apnea detection methods. The feasibility of model pruning and binarization to reduce the computational complexity is explored. The pruned network with 80% sparsity exhibited an accuracy of 89.75%, and the binarized network exhibited an accuracy of 68.22%. The performance of the proposed networks is compared against several state-of-the-art algorithms.
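Binarization, as mentioned above, replaces each weight with one of two values. The sketch below assumes a common sign-based scheme with a single scaling factor equal to the mean absolute weight; the abstract does not say which binarization scheme was used, so this is an illustrative choice and `binarize` is a hypothetical helper.

```python
from typing import List, Sequence


def binarize(weights: Sequence[float]) -> List[float]:
    """Map each weight to +alpha or -alpha, where alpha is the mean
    absolute value of the weights (assumed scaling rule)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights]


w = [0.5, -1.5, 1.0, -1.0]
print(binarize(w))
```

Each weight then needs one bit plus a shared scale, at the price of discarding magnitude information, which is consistent with the larger accuracy drop reported for the binarized network than for the pruned one.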
Abstract:Internet of Things (IoT) enabled wearable sensors for health monitoring are widely used to reduce the cost of personal healthcare and improve quality of life. The sleep apnea-hypopnea syndrome, characterized by the abnormal reduction or pause in breathing, greatly affects the quality of sleep of an individual. This paper introduces a novel method for apnea detection (pause in breathing) from electrocardiogram (ECG) signals obtained from wearable devices. The novelty stems from the high resolution of apnea detection on a second-by-second basis, and this is achieved using a 1-dimensional convolutional neural network for feature extraction and detection of sleep apnea events. The proposed method exhibits an accuracy of 99.56% and a sensitivity of 96.05%. This model outperforms several lower resolution state-of-the-art apnea detection methods. The complexity of the proposed model is analyzed. We also analyze the feasibility of model pruning and binarization to reduce the resource requirements on a wearable IoT device. The pruned model with 80% sparsity exhibited an accuracy of 97.34% and a sensitivity of 86.48%. The binarized model exhibited an accuracy of 75.59% and sensitivity of 63.23%. The performance of low complexity patient-specific models derived from the generic model is also studied to analyze the feasibility of retraining existing models to fit patient-specific requirements. The patient-specific models on average exhibited an accuracy of 97.79% and sensitivity of 92.23%. The source code for this work is made publicly available.
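The abstract above reports accuracy and sensitivity separately, and the two can diverge when apnea seconds are much rarer than normal seconds. The sketch below computes both from per-second binary labels; the label sequences are made-up toy data, not results from the paper, and `accuracy_and_sensitivity` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def accuracy_and_sensitivity(y_true: Sequence[int],
                             y_pred: Sequence[int]) -> Tuple[float, float]:
    """Per-second evaluation with 1 = apnea and 0 = normal.

    accuracy    = (TP + TN) / all seconds
    sensitivity = TP / (TP + FN), the fraction of apnea seconds detected
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Toy labels: 10 seconds, 4 of them apnea, one apnea second missed.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
print(accuracy_and_sensitivity(y_true, y_pred))
```

A single missed apnea second costs one tenth of the accuracy here but one quarter of the sensitivity, which is why the two metrics are worth reporting side by side.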