Abstract: This work presents a study of label noise in medical image segmentation, considering a noise model based on Gaussian field deformations. Such noise is of interest because it yields realistic-looking segmentations and because it is unbiased in the sense that the expected deformation is the identity mapping. Efficient methods for sampling and closed-form solutions for the marginal probabilities are provided. Moreover, theoretically optimal solutions to the cross-entropy and soft-Dice loss functions are studied, and it is shown how they diverge as the level of noise increases. Based on recent work on loss function characterization, it is shown that optimal solutions to soft-Dice can be recovered by thresholding solutions to cross-entropy at a particular, a priori unknown threshold that can be computed efficiently. This raises the question of whether the decrease in performance seen when using cross-entropy instead of soft-Dice is caused by using the wrong threshold. The hypothesis is validated in 5-fold studies on three organ segmentation problems from the TotalSegmentator data set, using 4 different noise strengths. The results show that changing the threshold takes cross-entropy from performing systematically worse than soft-Dice to performing on par with or better than soft-Dice.
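As a concrete illustration of the noise model, here is a minimal NumPy/SciPy sketch of sampling a smooth, zero-mean Gaussian deformation field and applying it to a 2D binary segmentation mask. The function name and the parameters `sigma` (field smoothness) and `alpha` (deformation strength) are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def sample_deformed_label(mask, sigma=8.0, alpha=15.0, rng=None):
    """Warp a binary mask with a smooth, zero-mean Gaussian random deformation.

    Because the displacement field has zero mean, the expected deformation is
    (approximately) the identity mapping, matching the unbiasedness property.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = mask.shape
    # White noise smoothed into a Gaussian random field, scaled by alpha.
    dx = gaussian_filter(rng.standard_normal((h, w)), sigma) * alpha
    dy = gaussian_filter(rng.standard_normal((h, w)), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys + dy, xs + dx])
    # Linear interpolation of the mask, re-binarized at 0.5.
    return map_coordinates(mask.astype(float), coords, order=1, mode="nearest") > 0.5
```

Averaging many such warped masks yields an estimate of the per-pixel marginal foreground probabilities for which the paper provides closed-form expressions.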
Abstract: This paper presents a study of the soft-Dice loss, one of the most popular loss functions in medical image segmentation, in situations where noise is present in the target labels. In particular, the set of optimal solutions is characterized and sharp bounds on the volume bias of these solutions are provided. It is further shown that a sequence of soft segmentations converging to optimal soft-Dice also converges to optimal Dice when converted to hard segmentations by thresholding. This is an important result because soft-Dice is often used as a proxy for maximizing the Dice metric. Finally, experiments confirming the theoretical results are provided.
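For reference, a minimal sketch of the soft-Dice score on soft predictions and of the hard Dice metric obtained by thresholding, the two quantities whose optima the paper relates. The smoothing constant `eps` and the threshold `tau` are illustrative defaults; the exact formulation used in the paper may differ.

```python
import numpy as np

def soft_dice(pred, target, eps=1e-6):
    """Soft-Dice score between a soft prediction in [0, 1] and a binary target."""
    inter = (pred * target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def hard_dice(pred, target, tau=0.5, eps=1e-6):
    """Dice metric after converting the soft prediction to a hard segmentation."""
    return soft_dice((pred > tau).astype(float), target, eps)
```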
Abstract: We study two of the most popular performance metrics in medical image segmentation, Accuracy and Dice, when the target labels are noisy. For both metrics, several statements related to the characterization and volume properties of the set of optimal segmentations are proved, and associated experiments are provided. Our main insights are: (i) the volume of the solutions to both metrics may deviate significantly from the expected volume of the target, (ii) the volume of a solution to Accuracy is always less than or equal to the volume of a solution to Dice, and (iii) the optimal solutions to the two metrics coincide when the set of feasible segmentations is constrained to segmentations whose volume equals the expected volume of the target.
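Insight (ii) can be checked numerically on a toy problem: with independent Bernoulli target labels, brute-force maximization of the expected metrics over all segmentations of a handful of pixels exhibits the volume ordering. This is an illustrative sketch under those assumptions, not the paper's experimental setup.

```python
import itertools
import numpy as np

def expected_metric(seg, probs, metric):
    """Expectation of metric(seg, target) over independent Bernoulli targets."""
    total = 0.0
    for bits in itertools.product([0, 1], repeat=len(probs)):
        t = np.array(bits)
        p = np.prod(np.where(t == 1, probs, 1 - probs))
        total += p * metric(seg, t)
    return total

def dice(s, t, eps=1e-6):
    return (2.0 * np.sum(s * t) + eps) / (np.sum(s) + np.sum(t) + eps)

def acc(s, t):
    return np.mean(s == t)

rng = np.random.default_rng(0)
probs = rng.uniform(0, 1, size=8)  # per-pixel foreground marginals (noisy targets)
segs = [np.array(b) for b in itertools.product([0, 1], repeat=8)]
best_acc = max(segs, key=lambda s: expected_metric(s, probs, acc))
best_dice = max(segs, key=lambda s: expected_metric(s, probs, dice))
print(best_acc.sum(), "<=", best_dice.sum())  # volume(Acc-opt) <= volume(Dice-opt)
```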
Abstract: Clustering using deep neural network models has been studied extensively in recent years. Among the most popular frameworks are VAEs and GANs, which learn latent feature representations of data through encoder/decoder network structures. This is a suitable basis for clustering tasks, as the latent space often captures the inherent essence of the data, simplifying its manifold and reducing noise. In this article, the VAE framework is used to investigate how gradient ascent of the probability density over data points can be used to process data in order to achieve better clustering. Improvements in classification are observed compared with unprocessed data, although state-of-the-art results are not obtained. Processing data with gradient ascent, however, results in more distinct cluster separation, making it simpler to investigate suitable hyperparameter settings such as the number of clusters. We propose a simple yet effective method for determining a suitable number of clusters, based on the DBSCAN clustering algorithm, and demonstrate that cluster number determination is facilitated by gradient processing. As an additional curiosity, we find that our baseline model used for comparison, a GMM on a t-SNE embedding of the latent space of a VAE trained with weight one on the reconstruction term (i.e., an autoencoder), yields state-of-the-art results on the MNIST data, to our knowledge not beaten by any other existing model.
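A minimal PyTorch sketch of the gradient-processing step, assuming some differentiable latent log-density `log_prob` (for instance the VAE prior or a fitted mixture density); the function and its hyperparameters are hypothetical, not the paper's implementation.

```python
import torch

def density_ascent(z, log_prob, steps=50, lr=0.1):
    """Move latent codes z uphill on a differentiable log-density log_prob.

    Points drift toward nearby density modes, which tends to tighten clusters
    before running a clustering algorithm such as DBSCAN or a GMM.
    """
    z = z.clone().detach().requires_grad_(True)
    for _ in range(steps):
        (grad,) = torch.autograd.grad(log_prob(z).sum(), z)
        with torch.no_grad():
            z += lr * grad  # plain gradient-ascent step on the log-density
    return z.detach()

# Example usage with a standard-normal latent density:
# log_prob = lambda z: -0.5 * (z ** 2).sum(dim=-1)
# z_processed = density_ascent(z_latent, log_prob)
```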
Abstract: Autoencoding models have been studied extensively in recent years. They provide an efficient framework for sample generation as well as for analysing feature learning, and they are effective at interpolating between data points in semantically meaningful ways. In this paper, we build on a previously introduced method for generating canonical, dimension-independent, stochastic interpolations. In this method, the distribution of interpolation paths is represented as the distribution of a bridge process constructed from an artificial random data-generating process in the latent space, having the prior distribution as its invariant distribution. As a result, the stochastic interpolation paths tend to reside in regions of the latent space where the prior has high mass, a desirable feature since such areas generally produce semantically meaningful samples. We extend the bridge process method by introducing a discriminator network that accurately identifies areas of high latent representation density. The discriminator is incorporated as a change of measure on the underlying bridge process, and sampling of interpolation paths is implemented using sequential Monte Carlo. The resulting sampling procedure allows for greater variability in interpolation paths and a stronger drift towards areas of high data density.
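A schematic PyTorch sketch of a discriminator-weighted sequential Monte Carlo sampler: particles follow a discretized bridge from one latent code to another and are resampled according to a hypothetical discriminator score `disc`, implementing the change of measure. The names, the proposal discretization, and the weighting are illustrative assumptions, not the paper's exact construction.

```python
import torch

def smc_bridge_interpolate(z0, z1, disc, n_particles=64, n_steps=32, sigma=0.2):
    """SMC sketch of a discriminator-reweighted bridge interpolation.

    z0, z1: latent endpoints of shape (d,). disc: network scoring latent
    density (higher = denser). Returns an averaged path of shape (n_steps+1, d).
    """
    d = z0.shape[-1]
    z = z0.expand(n_particles, d).clone()
    path = [z0.clone()]
    for k in range(1, n_steps):
        t = k / n_steps
        # Brownian-bridge proposal: drift toward z1 plus shrinking diffusion noise.
        drift = (z1 - z) / (n_steps - k + 1)
        z = z + drift + sigma * (1 - t) ** 0.5 * torch.randn_like(z)
        # Change of measure: reweight and resample by the discriminator score.
        w = torch.softmax(disc(z).reshape(-1), dim=0)
        idx = torch.multinomial(w, n_particles, replacement=True)
        z = z[idx]
        path.append(z.mean(dim=0))
    path.append(z1.clone())
    return torch.stack(path)

# Example usage with a toy linear discriminator in a d-dimensional latent space:
# d = 16
# disc = torch.nn.Linear(d, 1)
# path = smc_bridge_interpolate(torch.randn(d), torch.randn(d), disc)
```

The resampling step is what pulls the particle cloud towards regions the discriminator marks as high density, giving the stronger drift described in the abstract.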