José Fernando Núñez

Synthetic ECG Generation for Data Augmentation and Transfer Learning in Arrhythmia Classification

Nov 27, 2024

José Fernando Núñez, Jamie Arjona, Javier Béjar

Abstract: Deep learning models need a sufficient amount of data to find the hidden patterns in it. Generative modeling aims to learn the data distribution, which allows us to sample more data and augment the original dataset. Physiological data, and electrocardiogram (ECG) data in particular, is sensitive and expensive to collect, so generative models can be used to enlarge existing datasets and improve downstream tasks, in our case the classification of heart rhythm. In this work, we explore the usefulness of synthetic data generated with different deep generative models, namely Diffwave, Time-Diffusion and Time-VQVAE, for obtaining better classification results on two open-source multivariate ECG datasets. We also investigate the effects of transfer learning by fine-tuning a synthetically pre-trained model with progressively increasing proportions of real data. We conclude that, although the synthetic samples resemble the real ones, the classification improvement from simply augmenting the real dataset is barely noticeable on individual datasets; when both datasets are merged, however, all classifier metrics improve when synthetic samples are used as augmented data. In the fine-tuning experiments, Time-VQVAE proved superior to the other generative models, but not powerful enough to achieve results close to those of a classifier trained with real data only. In addition, methods and metrics for measuring the closeness between synthetic and real data were explored as a by-product of the main research questions of this study.
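The two experimental setups named in the abstract (augmenting real data with synthetic samples, and fine-tuning with growing proportions of real data) can be illustrated with a minimal sketch. This is not the authors' code: the function names `augment` and `real_data_schedule`, the fractions, and the representation of a sample as a `(signal, label)` tuple are all assumptions made for illustration, and the abstract does not state which proportions were actually used.

```python
import random


def augment(real, synthetic):
    """Augmentation setup: train on real and synthetic samples together."""
    return list(real) + list(synthetic)


def real_data_schedule(real, fractions=(0.1, 0.25, 0.5, 1.0), seed=0):
    """Yield (fraction, subset) pairs of real data for fine-tuning.

    The fractions are hypothetical placeholders. The real samples are
    shuffled once, so each larger subset contains the smaller ones.
    """
    rng = random.Random(seed)
    order = list(range(len(real)))
    rng.shuffle(order)
    for frac in fractions:
        k = max(1, round(frac * len(real)))
        yield frac, [real[i] for i in order[:k]]


# Toy stand-ins for ECG samples: (signal, label) tuples.
real = [([0.0] * 4, i % 2) for i in range(20)]
synthetic = [([1.0] * 4, i % 2) for i in range(30)]

augmented = augment(real, synthetic)
schedule = list(real_data_schedule(real))
```

In a transfer-learning run of this kind, a classifier would first be trained on `synthetic` alone and then fine-tuned once per entry in `schedule`. Nesting the subsets keeps the comparison between proportions from depending on which real samples happen to be drawn.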
