Abstract:The integration of Artificial Intelligence (AI) into clinical research has great potential to reveal patterns that are difficult for humans to detect, creating impactful connections between inputs and clinical outcomes. However, these methods often require large amounts of labeled data, which can be difficult to obtain in healthcare due to strict privacy laws and the need for experts to annotate data. This requirement creates a bottleneck when investigating unexplored clinical questions. This study explores the application of Self-Supervised Learning (SSL) as a way to obtain preliminary results from clinical studies with limited sized cohorts. To assess our approach, we focus on an underexplored clinical task: screening subjects for Paroxysmal Atrial Fibrillation (P-AF) using remote monitoring, single-lead ECG signals captured during normal sinus rhythm. We evaluate state-of-the-art SSL methods alongside supervised learning approaches, where SSL outperforms supervised learning in this task of interest. More importantly, it prevents misleading conclusions that may arise from poor performance in the latter paradigm when dealing with limited cohort settings.
Abstract:Wearable sensing devices, such as Electrocardiogram (ECG) heart-rate monitors, will play a crucial role in the future of digital health. This continuous monitoring leads to massive unlabeled data, incentivizing the development of unsupervised learning frameworks. While Masked Data Modelling (MDM) techniques have enjoyed wide use, their direct application to single-lead ECG data is suboptimal due to the decoder's difficulty handling irregular heartbeat intervals when no contextual information is provided. In this paper, we present Cueing the Predictor Increments the Detailing (CuPID), a novel MDM method tailored to single-lead ECGs. CuPID enhances existing MDM techniques by cueing spectrogram-derived context to the decoder, thus incentivizing the encoder to produce more detailed representations. This has a significant impact on the encoder's performance across a wide range of different configurations, leading CuPID to outperform state-of-the-art methods in a variety of downstream tasks.