Neonates are highly susceptible to seizures, often leading to short or long-term neurological impairments. However, clinical manifestations of neonatal seizures are subtle and often lead to misdiagnoses. This increases the risk of prolonged, untreated seizure activity and subsequent brain injury. Continuous video electroencephalogram (cEEG) monitoring is the gold standard for seizure detection. However, this is an expensive evaluation that requires expertise and time. In this study, we propose a convolutional neural network-based model for early prediction of neonatal seizures by distinguishing between interictal and preictal states of the EEG. Our model is patient-independent, enabling generalization across multiple subjects, and utilizes mel-frequency cepstral coefficient matrices extracted from multichannel EEG and electrocardiogram (ECG) signals as input features. Trained and validated on the Helsinki neonatal EEG dataset with 10-fold cross-validation, the proposed model achieved an average accuracy of 97.52%, sensitivity of 98.31%, specificity of 96.39%, and F1-score of 97.95%, enabling accurate seizure prediction up to 30 minutes before onset. The inclusion of ECG alongside EEG improved the F1-score by 1.42%, while the incorporation of an attention mechanism yielded an additional 0.5% improvement. To enhance transparency, we incorporated SHapley Additive exPlanations (SHAP) as an explainable artificial intelligence method to interpret the model and provided localization of seizure focus using scalp plots. The overall results demonstrate the model's potential for minimally supervised deployment in neonatal intensive care units, enabling timely and reliable prediction of neonatal seizures, while demonstrating strong generalization capability across unseen subjects through transfer learning.
Electrocardiogram (ECG) signals are often degraded by various noise sources such as baseline wander, motion artifacts, and electromyographic interference, posing a major challenge in clinical settings. This paper presents a lightweight deep learning-based denoising framework, forming a compact autoencoder architecture. The model was trained under severe noise conditions (-5 dB signal-to-noise ratio (SNR)) using a rigorously partitioned dataset to ensure no data leakage and robust generalization. Extensive evaluations were conducted across seven noise configurations and three SNR levels (-5 dB, 0 dB, and +5 dB), showing consistent denoising performance with minimal morphological distortion, critical for maintaining diagnostic integrity. In particular, tests on clinically vital rhythms such as ventricular tachycardia (VT) and ventricular fibrillation (VF) confirm that the proposed model effectively suppresses noise without altering arrhythmic features essential for diagnosis. Visual and quantitative assessments, including SNR improvement, RMSE, and correlation metrics, validate the model's efficacy in preserving waveform fidelity. To demonstrate real-world applicability, the model was deployed on a Raspberry Pi 4 using TensorFlow Lite with float16 precision. Inference latency was measured at just 1.41 seconds per 14-second ECG segment, indicating feasibility for near-real-time use in edge devices. Overall, this study introduces a lightweight, hardware-validated, and morphologically reliable ECG denoising solution suitable for integration into portable or wearable healthcare systems.
Cardiovascular disease (CVD) is a leading cause of mortality worldwide. Electrocardiograms (ECGs) are the most widely used non-invasive tool for cardiac assessment, yet large, well-annotated ECG corpora are scarce due to cost, privacy, and workflow constraints. Generating ECGs can be beneficial for the mechanistic understanding of cardiac electrical activity, enable the construction of large, heterogeneous, and unbiased datasets, and facilitate privacy-preserving data sharing. Generating realistic ECG signals from clinical context is important yet underexplored. Recent work has leveraged diffusion models for text-to-ECG generation, but two challenges remain: (i) existing methods often overlook the physiological simulator knowledge of cardiac activity; and (ii) they ignore broader, experience-based clinical knowledge grounded in real-world practice. To address these gaps, we propose SE-Diff, a novel physiological simulator and experience enhanced diffusion model for comprehensive ECG generation. SE-Diff integrates a lightweight ordinary differential equation (ODE)-based ECG simulator into the diffusion process via a beat decoder and simulator-consistent constraints, injecting mechanistic priors that promote physiologically plausible waveforms. In parallel, we design an LLM-powered experience retrieval-augmented strategy to inject clinical knowledge, providing more guidance for ECG generation. Extensive experiments on real-world ECG datasets demonstrate that SE-Diff improves both signal fidelity and text-ECG semantic alignment over baselines, proving its superiority for text-to-ECG generation. We further show that the simulator-based and experience-based knowledge also benefit downstream ECG classification.




Early and accurate detection of cardiac arrhythmias is vital for timely diagnosis and intervention. We propose a lightweight deep learning model combining 1D Convolutional Neural Networks (CNN), attention mechanisms, and Bidirectional Long Short-Term Memory (BiLSTM) for classifying arrhythmias from both 12-lead and single-lead ECGs. Evaluated on the CPSC 2018 dataset, the model addresses class imbalance using a class-weighted loss and demonstrates superior accuracy and F1- scores over baseline models. With only 0.945 million parameters, our model is well-suited for real-time deployment in wearable health monitoring systems.
Despite the rapid advancements of electrocardiogram (ECG) signal diagnosis and analysis methods through deep learning, two major hurdles still limit their clinical adoption: the lack of versatility in processing ECG signals with diverse configurations, and the inadequate detection of risk signals due to sample imbalances. Addressing these challenges, we introduce VersAtile and Risk-Sensitive cardiac diagnosis (VARS), an innovative approach that employs a graph-based representation to uniformly model heterogeneous ECG signals. VARS stands out by transforming ECG signals into versatile graph structures that capture critical diagnostic features, irrespective of signal diversity in the lead count, sampling frequency, and duration. This graph-centric formulation also enhances diagnostic sensitivity, enabling precise localization and identification of abnormal ECG patterns that often elude standard analysis methods. To facilitate representation transformation, our approach integrates denoising reconstruction with contrastive learning to preserve raw ECG information while highlighting pathognomonic patterns. We rigorously evaluate the efficacy of VARS on three distinct ECG datasets, encompassing a range of structural variations. The results demonstrate that VARS not only consistently surpasses existing state-of-the-art models across all these datasets but also exhibits substantial improvement in identifying risk signals. Additionally, VARS offers interpretability by pinpointing the exact waveforms that lead to specific model outputs, thereby assisting clinicians in making informed decisions. These findings suggest that our VARS will likely emerge as an invaluable tool for comprehensive cardiac health assessment.




Cardiovascular disease causes high rates of mortality worldwide. Coronary artery calcium (CAC) scoring is a powerful tool to stratify the risk of atherosclerotic cardiovascular disease. Current scoring practices require time-intensive semiautomatic analysis of cardiac computed tomography by radiologists and trained radiographers. The purpose of this study is to develop a deep learning convolutional neural networks (CNN) model to classify the calcium score in cardiac, non-contrast computed tomography images into one of six clinical categories. A total of 68 patient scans were retrospectively obtained together with their respective reported semiautomatic calcium score using an ECG-gated GE Discovery 570 Cardiac SPECT/CT camera. The dataset was divided into training, validation and test sets. Using the semiautomatic CAC score as the reference label, the model demonstrated high performance on a six-class CAC scoring categorisation task. Of the scans analysed, the model misclassified 32 cases, tending towards overestimating the CAC in 26 out of 32 misclassifications. Overall, the model showed high agreement (Cohen's kappa of 0.962), an overall accuracy of 96.5% and high generalisability. The results suggest that the model outputs were accurate and consistent with current semiautomatic practice, with good generalisability to test data. The model demonstrates the viability of a CNN model to stratify the calcium score into an expanded set of six clinical categories.




Wolff-Parkinson-White (WPW) syndrome is a cardiac electrophysiology (EP) disorder caused by the presence of an accessory pathway (AP) that bypasses the atrioventricular node, faster ventricular activation rate, and provides a substrate for atrio-ventricular reentrant tachycardia (AVRT). Accurate localization of the AP is critical for planning and guiding catheter ablation procedures. While traditional diagnostic tree (DT) methods and more recent machine learning (ML) approaches have been proposed to predict AP location from surface electrocardiogram (ECG), they are often constrained by limited anatomical localization resolution, poor interpretability, and the use of small clinical datasets. In this study, we present a Deep Learning (DL) model for the localization of single manifest APs across 24 cardiac regions, trained on a large, physiologically realistic database of synthetic ECGs generated using a personalized virtual heart model. We also integrate eXplainable Artificial Intelligence (XAI) methods, Guided Backpropagation, Grad-CAM, and Guided Grad-CAM, into the pipeline. This enables interpretation of DL decision-making and addresses one of the main barriers to clinical adoption: lack of transparency in ML predictions. Our model achieves localization accuracy above 95%, with a sensitivity of 94.32% and specificity of 99.78%. XAI outputs are physiologically validated against known depolarization patterns, and a novel index is introduced to identify the most informative ECG leads for AP localization. Results highlight lead V2 as the most critical, followed by aVF, V1, and aVL. This work demonstrates the potential of combining cardiac digital twins with explainable DL to enable accurate, transparent, and non-invasive AP localization.
Localisation tasks in biomedical data often require models to learn meaningful spatial or temporal relationships from signals with complex intensity distributions. A common strategy, exemplified by CoordConv layers, is to append coordinate channels to convolutional inputs, enabling networks to learn absolute positions. In this work, we propose a signal intensity-weighted coordinate representation that replaces the pure coordinate channels with channels scaled by local signal intensity. This modification embeds an intensity-position coupling directly in the input representation, introducing a simple and modality-agnostic inductive bias. We evaluate the approach on two distinct localisation problems: (i) predicting the time of morphological transition in 20-second, two-lead ECG signals, and (ii) regressing the coordinates of nuclear centres in cytological images from the SiPaKMeD dataset. In both cases, the proposed representation yields faster convergence and higher generalisation performance relative to conventional coordinate-channel approaches, demonstrating its effectiveness across both one-dimensional and two-dimensional biomedical signals.



Polysomnography signals are highly heterogeneous, varying in modality composition (e.g., EEG, EOG, ECG), channel availability (e.g., frontal, occipital EEG), and acquisition protocols across datasets and clinical sites. Most existing models that process polysomnography data rely on a fixed subset of modalities or channels and therefore neglect to fully exploit its inherently multimodal nature. We address this limitation by introducing NAP (Neural Aggregator of Predictions), an attention-based model which learns to combine multiple prediction streams using a tri-axial attention mechanism that captures temporal, spatial, and predictor-level dependencies. NAP is trained to adapt to different input dimensions. By aggregating outputs from frozen, pretrained single-channel models, NAP consistently outperforms individual predictors and simple ensembles, achieving state-of-the-art zero-shot generalization across multiple datasets. While demonstrated in the context of automated sleep staging from polysomnography, the proposed approach could be extended to other multimodal physiological applications.




Non-contact electrocardiogram (ECG) reconstruction from radar signals offers a promising approach for unobtrusive cardiac monitoring. We present LifWavNet, a lifting wavelet network based on a multi-resolution analysis and synthesis (MRAS) model for radar-to-ECG reconstruction. Unlike prior models that use fixed wavelet approaches, LifWavNet employs learnable lifting wavelets with lifting and inverse lifting units to adaptively capture radar signal features and synthesize physiologically meaningful ECG waveforms. To improve reconstruction fidelity, we introduce a multi-resolution short-time Fourier transform (STFT) loss, that enforces consistency with the ground-truth ECG in both temporal and spectral domains. Evaluations on two public datasets demonstrate that LifWavNet outperforms state-of-the-art methods in ECG reconstruction and downstream vital sign estimation (heart rate and heart rate variability). Furthermore, intermediate feature visualization highlights the interpretability of multi-resolution decomposition and synthesis in radar-to-ECG reconstruction. These results establish LifWavNet as a robust framework for radar-based non-contact ECG measurement.