Abstract:This paper presents a comprehensive study focused on disentangling hippocampal shape variations from diffusion tensor imaging (DTI) datasets within the context of neurological disorders. Leveraging a Graph Variational Autoencoder (VAE) enhanced with Supervised Contrastive Learning, our approach aims to improve interpretability by disentangling two distinct latent variables corresponding to age and the presence of diseases. In our ablation study, we investigate a range of VAE architectures and contrastive loss functions, showcasing the enhanced disentanglement capabilities of our approach. This evaluation uses synthetic 3D torus mesh data and real 3D hippocampal mesh datasets derived from the DTI hippocampal dataset. Our supervised disentanglement model outperforms several state-of-the-art (SOTA) methods like attribute and guided VAEs in terms of disentanglement scores. Our model distinguishes between age groups and disease status in patients with Multiple Sclerosis (MS) using the hippocampus data. Our Graph VAE with Supervised Contrastive Learning shows the volume changes of the hippocampus of MS populations at different ages, and the result is consistent with the current neuroimaging literature. This research provides valuable insights into the relationship between neurological disorder and hippocampal shape changes in different age groups of MS populations using a Graph VAE with Supervised Contrastive loss.
Abstract:The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for method development. 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants. Only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Abstract:Estimating the uncertainty in image registration is an area of current research that is aimed at providing information that will enable surgeons to assess the operative risk based on registered image data and the estimated registration uncertainty. If they receive inaccurately calculated registration uncertainty and misplace confidence in the alignment solutions, severe consequences may result. For probabilistic image registration (PIR), most research quantifies the registration uncertainty using summary statistics of the transformation distributions. In this paper, we study a rarely examined topic: whether those summary statistics of the transformation distribution truly represent the registration uncertainty. Using concrete examples, we show that there are two types of uncertainties: the transformation uncertainty, Ut, and label uncertainty Ul. Ut indicates the doubt concerning transformation parameters and can be estimated by conventional uncertainty measures, while Ul is strongly linked to the goal of registration. Further, we show that using Ut to quantify Ul is inappropriate and can be misleading. In addition, we present some potentially critical findings regarding PIR.
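The distinction between Ut and Ul drawn in the abstract above can be illustrated with a toy calculation. The sketch below is not taken from the paper: the 1D label map, the displacement samples, and the choice of variance as a stand-in for Ut and label entropy as a stand-in for Ul are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical 1D label map: positions 0..9 are tissue "A", 10..19 are tissue "B".
LABEL_MAP = ["A"] * 10 + ["B"] * 10


def variance(samples):
    """Population variance of the displacement samples (assumed proxy for Ut)."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def label_entropy(labels):
    """Shannon entropy in bits of the label distribution (assumed proxy for Ul)."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def uncertainties(voxel, displacements):
    """Return (Ut proxy, Ul proxy) for one voxel given sampled displacements."""
    labels = [LABEL_MAP[voxel + d] for d in displacements]
    return variance(displacements), label_entropy(labels)


# Widely spread displacements that all land inside tissue "A".
wide = uncertainties(4, [-4, -2, 0, 2, 4])
# Tightly clustered displacements that straddle the A/B boundary.
narrow = uncertainties(9, [0, 1, 0, 1])

print("wide:   Ut proxy = %.2f, Ul proxy = %.2f" % wide)
print("narrow: Ut proxy = %.2f, Ul proxy = %.2f" % narrow)
```

In this toy setting the first voxel has the larger transformation spread but no ambiguity about its label, while the second has the smaller spread but maximal label ambiguity. This is the kind of mismatch that makes using Ut to quantify Ul misleading.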
Abstract:We propose an end-to-end neural network that improves the segmentation accuracy of fully convolutional networks by incorporating a localization unit. This network performs object localization first, which is then used as a cue to guide the training of the segmentation network. We test the proposed method on a segmentation task of small objects on a clinical dataset of ultrasound images. We show that by jointly learning for detection and segmentation, the proposed network is able to improve the segmentation accuracy compared to only learning for segmentation.
Abstract:We propose an attention mechanism for 3D medical image segmentation. The method, named segmentation-by-detection, is a cascade of a detection module followed by a segmentation module. The detection module enables a region of interest to come to attention and produces a set of object region candidates, which are further used as an attention model. Rather than dealing with the entire volume, the segmentation module distills the information from the potential region. This scheme is an efficient solution for volumetric data, as it reduces the influence of the surrounding noise, which is especially important for medical data with a low signal-to-noise ratio. Experimental results on 3D ultrasound data of the femoral head show the superiority of the proposed method over a standard fully convolutional network such as the U-Net.
Abstract:This paper proposes a novel image segmentation approach that integrates fully convolutional networks (FCNs) with a level set model. Compared with an FCN, the integrated method can incorporate smoothing and prior information to achieve an accurate segmentation. Furthermore, rather than using the level set model as a post-processing tool, we integrate it into the training phase to fine-tune the FCN. This allows the use of unlabeled data during training in a semi-supervised setting. Using two types of medical imaging data (liver CT and left ventricle MRI data), we show that the integrated method achieves good performance even when little training data is available, outperforming the FCN or the level set model alone.
Abstract:As a task of establishing spatial correspondences, medical image registration is often formalized as finding the optimal transformation that best aligns two images. Since the transformation is such an essential component of registration, most existing research conventionally quantifies the registration uncertainty, which is the confidence in the estimated spatial correspondences, by the transformation uncertainty. In this paper, we give concrete examples and reveal that using the transformation uncertainty to quantify the registration uncertainty is inappropriate and sometimes misleading. Based on this finding, we also draw attention to an important yet subtle aspect of probabilistic image registration, namely whether it is reasonable to determine the correspondence of a registered voxel solely by the mode of its transformation distribution.
Abstract:Probabilistic image registration methods estimate the posterior distribution of the transformation. The conventional way of interpreting the transformation posterior is to use the mode as the most likely transformation and assign its corresponding intensity to the registered voxel. Meanwhile, summary statistics of the posterior are employed to evaluate the registration uncertainty, that is, the trustworthiness of the registered image. Despite its wide acceptance, this convention has never been justified. In this paper, based on illustrative examples, we question the correctness and usefulness of conventional methods. To faithfully translate the transformation posterior, we propose to encode the variability of values into a novel data type called ensemble fields. Ensemble fields can serve as a complement to the registered image and a foundation for developing advanced methods to characterize the uncertainty in registration-based tasks. We demonstrate the potential of ensemble fields with pilot examples.