Abstract:Annotating lots of 3D medical images for training segmentation models is time-consuming. The goal of weakly supervised semantic segmentation is to train segmentation models without using any ground truth segmentation masks. Our work addresses the case where only image-level categorical labels, indicating the presence or absence of a particular region of interest (such as tumours or lesions), are available. Most existing methods rely on class activation mapping (CAM). We propose a novel approach, ToNNO, which is based on the Tomographic reconstruction of a Neural Network's Output. Our technique extracts stacks of slices with different angles from the input 3D volume, feeds these slices to a 2D encoder, and applies the inverse Radon transform in order to reconstruct a 3D heatmap of the encoder's predictions. This generic method allows to perform dense prediction tasks on 3D volumes using any 2D image encoder. We apply it to weakly supervised medical image segmentation by training the 2D encoder to output high values for slices containing the regions of interest. We test it on four large scale medical image datasets and outperform 2D CAM methods. We then extend ToNNO by combining tomographic reconstruction with CAM methods, proposing Averaged CAM and Tomographic CAM, which obtain even better results.
Abstract:Choroid plexuses (CP) are structures of the ventricles of the brain which produce most of the cerebrospinal fluid (CSF). Several postmortem and in vivo studies have pointed towards their role in the inflammatory process in multiple sclerosis (MS). Automatic segmentation of CP from MRI thus has high value for studying their characteristics in large cohorts of patients. To the best of our knowledge, the only freely available tool for CP segmentation is FreeSurfer but its accuracy for this specific structure is poor. In this paper, we propose to automatically segment CP from non-contrast enhanced T1-weighted MRI. To that end, we introduce a new model called "Axial-MLP" based on an assembly of Axial multi-layer perceptrons (MLPs). This is inspired by recent works which showed that the self-attention layers of Transformers can be replaced with MLPs. This approach is systematically compared with a standard 3D U-Net, nnU-Net, Freesurfer and FastSurfer. For our experiments, we make use of a dataset of 141 subjects (44 controls and 97 patients with MS). We show that all the tested deep learning (DL) methods outperform FreeSurfer (Dice around 0.7 for DL vs 0.33 for FreeSurfer). Axial-MLP is competitive with U-Nets even though it is slightly less accurate. The conclusions of our paper are two-fold: 1) the studied deep learning methods could be useful tools to study CP in large cohorts of MS patients; 2)~Axial-MLP is a potentially viable alternative to convolutional neural networks for such tasks, although it could benefit from further improvements.