King's College London
Abstract:Echocardiography (echo) is the first imaging modality used when assessing cardiac function. The measurement of functional biomarkers from echo relies upon the segmentation of cardiac structures and deep learning models have been proposed to automate the segmentation process. However, in order to translate these tools to widespread clinical use it is important that the segmentation models are robust to a wide variety of images (e.g. acquired from different scanners, by operators with different levels of expertise etc.). To achieve this level of robustness it is necessary that the models are trained with multiple diverse datasets. A significant challenge faced when training with multiple diverse datasets is the variation in label presence, i.e. the combined data are often partially-labelled. Adaptations of the cross entropy loss function have been proposed to deal with partially labelled data. In this paper we show that training naively with such a loss function and multiple diverse datasets can lead to a form of shortcut learning, where the model associates label presence with domain characteristics, leading to a drop in performance. To address this problem, we propose a novel label dropout scheme to break the link between domain characteristics and the presence or absence of labels. We demonstrate that label dropout improves echo segmentation Dice score by 62% and 25% on two cardiac structures when training using multiple diverse partially labelled datasets.
Abstract:A unified self-supervised and supervised deep learning framework for PET image reconstruction is presented, including deep-learned filtered backprojection (DL-FBP) for sinograms, deep-learned backproject then filter (DL-BPF) for backprojected images, and a more general mapping using a deep network in both the sinogram and image domains (DL-FBP-F). The framework allows varying amounts and types of training data, from the case of having only one single dataset to reconstruct through to the case of having numerous measured datasets, which may or may not be paired with high-quality references. For purely self-supervised mappings, no reference or ground truth data are needed. The self-supervised deep-learned reconstruction operators all use a conventional image reconstruction objective within the loss function (e.g. maximum Poisson likelihood, maximum a posteriori). If it is desired for the reconstruction networks to generalise (i.e. to need either no or minimal retraining for a new measured dataset, but to be fast, ready to reuse), then these self-supervised networks show potential even when previously trained from just one single dataset. For any given new measured dataset, finetuning is however often necessary, and of course the initial training set should ideally go beyond just one dataset if a generalisable network is sought. Example results for the purely self-supervised single-dataset case are shown, but the networks can be i) trained uniquely for any measured dataset to reconstruct from, ii) pretrained on multiple datasets and then used with no retraining for new measured data, iii) pretrained and then finetuned for new measured data, iv) optionally trained with high-quality references. The framework, with its optional inclusion of supervised learning, provides a spectrum of reconstruction approaches by making use of whatever (if any) training data quantities and types are available.