Abstract:Existing speech processing systems consist of different modules, each individually optimized for a specific task such as acoustic modelling or feature extraction. In addition to not assuring optimality of the overall system, the disjoint nature of current speech processing systems makes them unsuitable for ubiquitous health applications. We propose WaDeNet, an end-to-end model for mobile speech processing. In order to incorporate spectral features, WaDeNet embeds wavelet decomposition of the speech signal within the architecture. This allows WaDeNet to learn from spectral features in an end-to-end manner, thus alleviating the need for the feature extraction and successive modules that are currently present in speech processing systems. WaDeNet outperforms the current state of the art on datasets that involve speech for mobile health applications such as non-invasive emotion recognition, achieving an average increase in accuracy of 6.36% over existing state-of-the-art models. Additionally, WaDeNet is considerably lighter than a simple CNN with a similar architecture.
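As a rough illustration of the wavelet decomposition mentioned in this abstract, the sketch below splits a signal into approximation and detail bands over several levels. The abstract does not say which wavelet family or how many levels WaDeNet uses, so the Haar wavelet, the function names `haar_step` and `haar_decompose`, and the sample values are assumptions made purely for illustration. This is not the authors' implementation.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT: return (approximation, detail) coefficients.

    Assumption: the Haar wavelet is used only because it is the simplest
    to write out; the abstract does not name a wavelet family.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition: repeatedly split the approximation band."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    # Hypothetical four-sample signal, decomposed over two levels.
    approx, details = haar_decompose([4.0, 2.0, 6.0, 8.0], levels=2)
    print("approximation:", approx)
    print("details per level:", details)
```

In an end-to-end model, sub-bands like these would be fed to learned layers rather than to a separate hand-crafted feature pipeline, which is the design the abstract describes.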
Abstract:Continuous monitoring of cardiac activity is paramount to understanding the functioning of the heart and to identifying precursors to conditions such as Atrial Fibrillation. Through continuous cardiac monitoring, early indications of a potential disorder can be detected before the actual event, allowing timely preventive measures to be taken. Electrocardiography (ECG) is an established standard for monitoring the function of the heart in clinical and non-clinical applications, but its electrode-based implementation makes it cumbersome, especially for uninterrupted monitoring. Hence we propose SeismoNet, a Deep Convolutional Neural Network which aims to provide an end-to-end solution to robustly observe heart activity from Seismocardiogram (SCG) signals. These SCG signals are motion-based and can be acquired in an easy, user-friendly fashion. Furthermore, the use of deep learning enables the detection of R-peaks directly from SCG signals in spite of their noise-ridden morphology and obviates the need for extracting hand-crafted features. SeismoNet was modelled on the publicly available CEBS dataset and achieved a high overall Sensitivity and Positive Predictive Value of 0.98 each.
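To make the reported metrics concrete, the sketch below shows one common way Sensitivity and Positive Predictive Value can be computed for peak detection: detected peaks are matched one-to-one to reference peaks within a tolerance window, and the counts of true positives, false positives and false negatives follow from the matching. The tolerance value, the matching rule, the function names and the sample indices are all assumptions; the abstract does not state the evaluation protocol used for SeismoNet.

```python
def match_peaks(reference, detected, tolerance):
    """Match detected peaks to reference peaks (sample indices), one-to-one.

    Assumption: a detection counts as correct when it lies within
    `tolerance` samples of an unmatched reference peak.
    Returns (true_positives, false_positives, false_negatives).
    """
    reference = sorted(reference)
    detected = sorted(detected)
    tp = 0
    i = j = 0
    while i < len(reference) and j < len(detected):
        diff = detected[j] - reference[i]
        if abs(diff) <= tolerance:
            tp += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # detection lies before any matchable reference peak
        else:
            i += 1  # reference peak has no detection close enough
    return tp, len(detected) - tp, len(reference) - tp


def sensitivity_ppv(tp, fp, fn):
    """Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP)."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, ppv


if __name__ == "__main__":
    # Hypothetical peak locations in samples, not taken from the CEBS dataset.
    tp, fp, fn = match_peaks([100, 300, 500, 700], [102, 298, 450, 705], tolerance=10)
    print(sensitivity_ppv(tp, fp, fn))
```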
Abstract:Continuous monitoring of blood oxygen saturation levels is vital for patients with pulmonary disorders. Traditionally, SpO$_2$ monitoring has been carried out using transmittance pulse oximeters due to their dependability. However, SpO$_2$ measurement from transmittance pulse oximeters is limited to peripheral regions. This becomes a disadvantage at very low temperatures, as blood perfusion to the periphery decreases. On the other hand, reflectance pulse oximeters can be used at various sites such as the finger, wrist, chest and forehead. Additionally, reflectance pulse oximeters can be scaled down to affordable patches that do not interfere with the user's diurnal activities. However, accurate SpO$_2$ estimation from reflectance pulse oximeters is challenging because the measurement is patient-dependent and subjective. Recently, a Machine Learning (ML) method was used to model reflectance waveforms onto SpO$_2$ obtained from transmittance waveforms. However, the generalizability of the model to new patients was not tested. In light of this, the current work implemented multiple ML-based approaches, which were subsequently found to be incapable of generalizing to new patients. A minimally calibrated, data-driven approach was therefore utilized to obtain SpO$_2$ from reflectance PPG waveforms. The proposed solution produces an average mean absolute error of 1.81\% on unseen patients, which is well within the clinically permissible error of 2\%. Two statistical tests were conducted to establish the effectiveness of the proposed method.
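The headline figure here is an average mean absolute error over unseen patients, compared against a 2\% permissible error. The sketch below shows one plausible reading of that computation: a per-patient MAE between reference and estimated SpO$_2$ values, averaged across patients. The averaging scheme, the function names and the sample readings are assumptions for illustration only; they are not data or code from the work itself.

```python
def mean_absolute_error(reference, estimate):
    """MAE between paired reference and estimated SpO2 readings (in % SpO2)."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)


def average_patient_mae(patients):
    """Average the per-patient MAE over all patients.

    `patients` maps a patient id to (reference readings, estimated readings).
    Assumption: each patient contributes equally to the average.
    """
    per_patient = {
        pid: mean_absolute_error(ref, est) for pid, (ref, est) in patients.items()
    }
    return sum(per_patient.values()) / len(per_patient), per_patient


if __name__ == "__main__":
    # Hypothetical readings for two held-out patients.
    patients = {
        "p1": ([98, 96, 94], [97, 97, 95]),
        "p2": ([92, 90], [94, 93]),
    }
    average, per_patient = average_patient_mae(patients)
    permissible_error = 2.0  # clinically permissible error quoted in the abstract
    print(per_patient, average, average <= permissible_error)
```

Reporting the per-patient values alongside the average is useful because a single poorly fitted patient can exceed the threshold even when the average stays below it.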
Abstract:Freezing of Gait (FoG) is a common gait deficit among patients diagnosed with Parkinson's Disease (PD). In order to help these patients recover from FoG episodes, Rhythmic Auditory Stimulation (RAS) is needed. We propose a ubiquitous embedded system that detects FoG events from accelerometer signals using a Machine Learning (ML) subsystem. By making inferences on-device, we avoid issues prevalent in cloud-based systems such as latency and network connection dependency. The resource-efficient classifier reduces the model size requirements by approximately 400 times compared to the best performing standard ML systems, at the cost of only 1.3% in best classification accuracy. This trade-off facilitates deployment on a wide range of embedded devices, including microcontroller-based systems. This work also explores the optimization procedure used to deploy the model on an ATMega2560 microcontroller with a minimum system latency of 44.5 ms. The smallest model size of the proposed resource-efficient ML model was 1.4 KB, with an average recall score of 93.58%.