Abstract:This paper addresses the critical need for online action representation, which is essential for various applications like rehabilitation, surveillance, etc. The task can be defined as representation of actions as soon as they happen in a streaming video without access to video frames in the future. Most of the existing methods use predefined window sizes for video segments, which is a restrictive assumption on the dynamics. The proposed method employs a change detection algorithm to automatically segment action sequences, which form meaningful sub-actions and subsequently fit symbolic generative motion programs to the clipped segments. We determine the start time and end time of segments using change detection followed by a piece-wise linear fit algorithm on joint angle and bone length sequences. Domain-specific symbolic primitives are fit to pose keypoint trajectories of those extracted segments in order to obtain a higher level semantic representation. Since this representation is part-based, it is complementary to the compositional nature of human actions, i.e., a complex activity can be broken down into elementary sub-actions. We show the effectiveness of this representation in the downstream task of class agnostic repetition detection. We propose a repetition counting algorithm based on consecutive similarity matching of primitives, which can do online repetition counting. We also compare the results with a similar but offline repetition counting algorithm. The results of the experiments demonstrate that, despite operating online, the proposed method performs better or on par with the existing method.
Abstract:Preterm babies in the Neonatal Intensive Care Unit (NICU) have to undergo continuous monitoring of their cardiac health. Conventional monitoring approaches are contact-based, making the neonates prone to various nosocomial infections. Video-based monitoring approaches have opened up potential avenues for contactless measurement. This work presents a pipeline for remote estimation of cardiopulmonary signals from videos in NICU setup. We have proposed an end-to-end deep learning (DL) model that integrates a non-learning based approach to generate surrogate ground truth (SGT) labels for supervision, thus refraining from direct dependency on true ground truth labels. We have performed an extended qualitative and quantitative analysis to examine the efficacy of our proposed DL-based pipeline and achieved an overall average mean absolute error of 4.6 beats per minute (bpm) and root mean square error of 6.2 bpm in the estimated heart rate.
Abstract:Automatic detection of R-peaks in an Electrocardiogram signal is crucial in a multitude of applications including Heart Rate Variability (HRV) analysis and Cardio Vascular Disease(CVD) diagnosis. Although there have been numerous approaches that have successfully addressed the problem, there has been a notable dip in the performance of these existing detectors on ECG episodes that contain noise and HRV Irregulates. On the other hand, Deep Learning(DL) based methods have shown to be adept at modelling data that contain noise. In image to image translation, Unet is the fundamental block in many of the networks. In this work, a novel application of the Unet combined with Inception and Residual blocks is proposed to perform the extraction of R-peaks from an ECG. Furthermore, the problem formulation also robustly deals with issues of variability and sparsity of ECG R-peaks. The proposed network was trained on a database containing ECG episodes that have CVD and was tested against three traditional ECG detectors on a validation set. The model achieved an F1 score of 0.9837, which is a substantial improvement over the other beat detectors. Furthermore, the model was also evaluated on three other databases. The proposed network achieved high F1 scores across all datasets which established its generalizing capacity. Additionally, a thorough analysis of the model's performance in the presence of different levels of noise was carried out.
Abstract:Continuous monitoring of blood oxygen saturation levels is vital for patients with pulmonary disorders. Traditionally, SpO$_2$ monitoring has been carried out using transmittance pulse oximeters due to its dependability. However, SpO$_2$ measurement from transmittance pulse oximeters is limited to peripheral regions. This becomes a disadvantage at very low temperatures as blood perfusion to the peripherals decreases. On the other hand, reflectance pulse oximeters can be used at various sites like finger, wrist, chest and forehead. Additionally, reflectance pulse oximeters can be scaled down to affordable patches that do not interfere with the user's diurnal activities. However, accurate SpO$_2$ estimation from reflectance pulse oximeters is challenging due to its patient dependent, subjective nature of measurement. Recently, a Machine Learning (ML) method was used to model reflectance waveforms onto SpO$_2$ obtained from transmittance waveforms. However, the generalizability of the model to new patients was not tested. In light of this, the current work implemented multiple ML based approaches which were subsequently found to be incapable of generalizing to new patients. Furthermore, a minimally calibrated data driven approach was utilized in order to obtain SpO$_2$ from reflectance PPG waveforms. The proposed solution produces an average mean absolute error of 1.81\% on unseen patients which is well within the clinically permissible error of 2\%. Two statistical tests were conducted to establish the effectiveness of the proposed method.
Abstract:Cardiac arrhythmia is a prevalent and significant cause of morbidity and mortality among cardiac ailments. Early diagnosis is crucial in providing intervention for patients suffering from cardiac arrhythmia. Traditionally, diagnosis is performed by examination of the Electrocardiogram (ECG) by a cardiologist. This method of diagnosis is hampered by the lack of accessibility to expert cardiologists. For quite some time, signal processing methods had been used to automate arrhythmia diagnosis. However, these traditional methods require expert knowledge and are unable to model a wide range of arrhythmia. Recently, Deep Learning methods have provided solutions to performing arrhythmia diagnosis at scale. However, the black-box nature of these models prohibit clinical interpretation of cardiac arrhythmia. There is a dire need to correlate the obtained model outputs to the corresponding segments of the ECG. To this end, two methods are proposed to provide interpretability to the models. The first method is a novel application of Gradient-weighted Class Activation Map (Grad-CAM) for visualizing the saliency of the CNN model. In the second approach, saliency is derived by learning the input deletion mask for the LSTM model. The visualizations are provided on a model whose competence is established by comparisons against baselines. The results of model saliency not only provide insight into the prediction capability of the model but also aligns with the medical literature for the classification of cardiac arrhythmia.
Abstract:For the task of medical image segmentation, fully convolutional network (FCN) based architectures have been extensively used with various modifications. A rising trend in these architectures is to employ joint-learning of the target region with an auxiliary task, a method commonly known as multi-task learning. These approaches help impose smoothness and shape priors, which vanilla FCN approaches do not necessarily incorporate. In this paper, we propose a novel plug-and-play module, which we term as Conv-MCD, which exploits structural information in two ways - i) using the contour map and ii) using the distance map, both of which can be obtained from ground truth segmentation maps with no additional annotation costs. The key benefit of our module is the ease of its addition to any state-of-the-art architecture, resulting in a significant improvement in performance with a minimal increase in parameters. To substantiate the above claim, we conduct extensive experiments using 4 state-of-the-art architectures across various evaluation metrics, and report a significant increase in performance in relation to the base networks. In addition to the aforementioned experiments, we also perform ablative studies and visualization of feature maps to further elucidate our approach.
Abstract:Continuous monitoring of cardiac health under free living condition is crucial to provide effective care for patients undergoing post operative recovery and individuals with high cardiac risk like the elderly. Capacitive Electrocardiogram (cECG) is one such technology which allows comfortable and long term monitoring through its ability to measure biopotential in conditions without having skin contact. cECG monitoring can be done using many household objects like chairs, beds and even car seats allowing for seamless monitoring of individuals. This method is unfortunately highly susceptible to motion artifacts which greatly limits its usage in clinical practice. The current use of cECG systems has been limited to performing rhythmic analysis. In this paper we propose a novel end-to-end deep learning architecture to perform the task of denoising capacitive ECG. The proposed network is trained using motion corrupted three channel cECG and a reference LEAD I ECG collected on individuals while driving a car. Further, we also propose a novel joint loss function to apply loss on both signal and frequency domain. We conduct extensive rhythmic analysis on the model predictions and the ground truth. We further evaluate the signal denoising using Mean Square Error(MSE) and Cross Correlation between model predictions and ground truth. We report MSE of 0.167 and Cross Correlation of 0.476. The reported results highlight the feasibility of performing morphological analysis using the filtered cECG. The proposed approach can allow for continuous and comprehensive monitoring of the individuals in free living conditions.
Abstract:Photoplethysmogram (PPG) is increasingly used to provide monitoring of the cardiovascular system under ambulatory conditions. Wearable devices like smartwatches use PPG to allow long term unobtrusive monitoring of heart rate in free living conditions. PPG based heart rate measurement is unfortunately highly susceptible to motion artifacts, particularly when measured from the wrist. Traditional machine learning and deep learning approaches rely on tri-axial accelerometer data along with PPG to perform heart rate estimation. The conventional learning based approaches have not addressed the need for device-specific modeling due to differences in hardware design among PPG devices. In this paper, we propose a novel end to end deep learning model to perform heart rate estimation using 8 second length input PPG signal. We evaluate the proposed model on the IEEE SPC 2015 dataset, achieving a mean absolute error of 3.36+-4.1BPM for HR estimation on 12 subjects without requiring patient specific training. We also studied the feasibility of applying transfer learning along with sparse retraining from a comprehensive in house PPG dataset for heart rate estimation across PPG devices with different hardware design.
Abstract:Respiratory ailments afflict a wide range of people and manifests itself through conditions like asthma and sleep apnea. Continuous monitoring of chronic respiratory ailments is seldom used outside the intensive care ward due to the large size and cost of the monitoring system. While Electrocardiogram (ECG) based respiration extraction is a validated approach, its adoption is limited by access to a suitable continuous ECG monitor. Recently, due to the widespread adoption of wearable smartwatches with in-built Photoplethysmogram (PPG) sensor, it is being considered as a viable candidate for continuous and unobtrusive respiration monitoring. Research in this domain, however, has been predominantly focussed on estimating respiration rate from PPG. In this work, a novel end-to-end deep learning network called RespNet is proposed to perform the task of extracting the respiration signal from a given input PPG as opposed to extracting respiration rate. The proposed network was trained and tested on two different datasets utilizing different modalities of reference respiration signal recordings. Also, the similarity and performance of the proposed network against two conventional signal processing approaches for extracting respiration signal were studied. The proposed method was tested on two independent datasets with a Mean Squared Error of 0.262 and 0.145. The Cross-Correlation coefficient of the respective datasets were found to be 0.933 and 0.931. The reported errors and similarity was found to be better than conventional approaches. The proposed approach would aid clinicians to provide comprehensive evaluation of sleep-related respiratory conditions and chronic respiratory ailments while being comfortable and inexpensive for the patient.