Abstract: Through automation, deep learning (DL) can enhance the analysis of transesophageal echocardiography (TEE) images. However, DL methods require large amounts of high-quality data to produce accurate results, a requirement that is difficult to satisfy. Data augmentation is commonly used to tackle this issue. In this work, we develop a pipeline to generate synthetic TEE images and corresponding semantic labels. The proposed data generation pipeline expands on an existing pipeline that generates synthetic transthoracic echocardiography images by transforming slices from anatomical models into synthetic images. We also demonstrate, through a left-ventricle semantic segmentation task, that such images can improve DL network performance. For the pipeline's unpaired image-to-image (I2I) translation section, we explore two generative methods: CycleGAN and contrastive unpaired translation. Next, we evaluate the synthetic images quantitatively using the Fréchet Inception Distance (FID) and qualitatively through a human perception quiz involving expert cardiologists and average researchers. In this study, we achieve a Dice score improvement of up to 10% when we augment datasets with our synthetic images. Furthermore, we compare established methods of assessing unpaired I2I translation and observe that they disagree when evaluating the synthetic images. Finally, we examine which metric better predicts the generated data's efficacy when used for data augmentation.
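For readers unfamiliar with the Dice score used to report the segmentation improvement above, the following is a minimal sketch of how it is commonly computed for a pair of binary masks. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the abstract does not state how the authors implemented the metric.

```python
import numpy as np


def dice_score(pred, target):
    """Dice overlap between two binary masks (hypothetical helper).

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)


# Toy example: the prediction covers two pixels, the label covers one of them.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice_score(pred_mask, true_mask)  # 2 * 1 / (2 + 1)
```

A score of 1 means the predicted and reference left-ventricle masks coincide exactly, and 0 means they do not overlap at all.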
Abstract: Cardiac function can be quantified through echocardiographic measurements of myocardial deformation. Clinical assessment of cardiac function generally focuses on global indices of relative shortening; however, territorial and segmental strain indices have been shown to be abnormal in regions of myocardial disease, such as scar. In this work, we propose a single framework to predict myocardial disease substrates at global, territorial, and segmental levels using regional myocardial strain traces as input to a convolutional neural network (CNN)-based classification algorithm. We propose an anatomically meaningful mapping of the input data from the clinically standard bullseye representation to a multi-channel 2D image, which formulates the task as an image classification problem and thus enables the use of state-of-the-art neural network configurations. A Fully Convolutional Network (FCN) is trained to detect and localize myocardial scar from regional left ventricular (LV) strain patterns. Simulated regional strain data from a controlled dataset of virtual patients with varying degrees and locations of myocardial scar are used for training and validation. Using strain traces only, the proposed method successfully detects and localizes the scars on 98% of the 5490 LV segments of the 305 patients in the test set. Because scar is sparse, only 10% of the LV segments in the virtual patient cohort contain scar. Taking this imbalance into account, the class-balanced accuracy is 95%. Performance is reported at global, territorial, and segmental levels. The proposed method proves successful on the strain traces of the virtual cohort and offers the potential to solve the regional myocardial scar detection problem on the strain traces of real patient cohorts.
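The gap between the 98% segment-level accuracy and the 95% class-balanced accuracy follows from the class imbalance described above. As a hedged illustration, the sketch below computes balanced accuracy as the mean of per-class recall, which is one common definition; the abstract does not specify the exact formula the authors used, and the function name `balanced_accuracy` and the toy labels are hypothetical.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (one common definition of balanced accuracy)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for cls in np.unique(y_true):
        mask = y_true == cls
        # Fraction of samples of this class that were predicted correctly.
        recalls.append(np.mean(y_pred[mask] == cls))
    return float(np.mean(recalls))


# Toy example with 10% scar prevalence: 9 healthy segments (0), 1 scarred (1).
labels = [0] * 9 + [1]
always_healthy = [0] * 10  # a trivial classifier that never predicts scar

plain_acc = float(np.mean(np.asarray(labels) == np.asarray(always_healthy)))
bal_acc = balanced_accuracy(labels, always_healthy)
```

In this toy case the trivial classifier reaches a plain accuracy of 0.9 but a balanced accuracy of only 0.5, which is why a balanced figure is informative when scarred segments are rare.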
Abstract: To facilitate diagnosis on cardiac ultrasound (US), clinical practice has established several standard views of the heart, which serve as reference points for diagnostic measurements and define viewports from which images are acquired. Automatic view recognition involves grouping those images into classes of standard views. Although deep learning techniques have been successful in achieving this, they still struggle to fully verify the suitability of an image for specific measurements, because suitability depends on factors such as the correct location and pose of cardiac structures and their potential occlusion. Our approach goes beyond view classification and incorporates a 3D mesh reconstruction of the heart that enables several further downstream tasks, such as segmentation and pose estimation. In this work, we explore learning 3D heart meshes via graph convolutions, using techniques similar to those used to learn 3D meshes from natural images, for example in human pose estimation. As the availability of fully annotated 3D images is limited, we generate synthetic US images from 3D meshes by training an adversarial denoising diffusion model. Experiments were conducted on synthetic and clinical cases for view recognition and structure detection. The approach yielded good performance on synthetic images and, despite being trained exclusively on synthetic data, it already showed potential when applied to clinical images. With this proof of concept, we aim to demonstrate the benefits of graphs for improving cardiac view recognition, which can ultimately lead to better efficiency in cardiac diagnosis.
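To make the idea of a graph convolution over mesh vertices concrete, the following is a minimal sketch of one generic layer with symmetric degree normalization and self-loops. This is an assumed, widely used formulation chosen for illustration, not necessarily the layer used in the work above; the function name `graph_conv`, the toy triangle mesh, and the identity weight matrix are all hypothetical.

```python
import numpy as np


def graph_conv(features, edges, weight):
    """One generic graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    features: (num_vertices, in_dim) per-vertex features, e.g. 3D coordinates
    edges:    iterable of (i, j) vertex index pairs taken from the mesh
    weight:   (in_dim, out_dim) weight matrix (learned in a real model)
    """
    n = features.shape[0]
    adj = np.eye(n)  # self-loops so each vertex keeps its own features
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0  # mesh edges are treated as undirected
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)


# Toy mesh: a single triangle whose vertex features are 3D coordinates.
vertices = np.array([[0.0, 0.0, 0.0],
                     [3.0, 0.0, 0.0],
                     [0.0, 3.0, 0.0]])
triangle_edges = [(0, 1), (1, 2), (2, 0)]
identity_weight = np.eye(3)
updated = graph_conv(vertices, triangle_edges, identity_weight)
```

Because every vertex of the triangle is connected to every other vertex, each output row here is the average of all three input rows. In a larger mesh, each vertex would instead aggregate only its immediate neighbors.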