Abstract: Domain randomization through synthesis is a powerful strategy to train networks that are unbiased with respect to the domain of the input images. Randomization allows networks to see a virtually infinite range of intensities and artifacts during training, thereby minimizing overfitting to appearance and maximizing generalization to unseen data. While powerful, this approach relies on the accurate tuning of a large set of hyper-parameters governing the probabilistic distribution of the synthesized images. Instead of manually tuning these parameters, we introduce Learn2Synth, a novel procedure in which synthesis parameters are learned using a small set of real labeled data. Unlike methods that impose constraints to align synthetic data with real data (e.g., contrastive or adversarial techniques), which risk misaligning the image and its label map, we tune an augmentation engine such that a segmentation network trained on synthetic data has optimal accuracy when applied to real data. This approach allows the training procedure to benefit from real labeled examples, without ever using these real examples to train the segmentation network, which avoids biasing the network towards the properties of the training set. Specifically, we develop both parametric and nonparametric strategies to augment the synthetic images, enhancing the segmentation network's performance. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of this learning strategy. Code is available at: https://github.com/HuXiaoling/Learn2Synth.
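One way to read this procedure is as bilevel (meta-gradient) optimization: an inner step trains the segmentation network on synthetic images rendered under the current synthesis parameters, and an outer step backpropagates the real-data loss through that inner update into the synthesis parameters. Below is a minimal PyTorch-style sketch under that assumption; `seg_net`, `synth_engine`, `theta`, and `loss_fn` are hypothetical names, the synthesis engine is assumed differentiable with respect to `theta`, and the actual Learn2Synth scheme may differ.

```python
import torch
from torch.func import functional_call

def learn2synth_step(seg_net, synth_engine, theta, synth_labels,
                     real_img, real_lab, opt_theta, loss_fn, inner_lr=1e-3):
    # Inner step: one differentiable SGD update of the segmentation network
    # on purely synthetic data rendered under the current parameters theta.
    params = dict(seg_net.named_parameters())
    synth_img = synth_engine(synth_labels, theta)
    inner_loss = loss_fn(functional_call(seg_net, params, (synth_img,)),
                         synth_labels)
    grads = torch.autograd.grad(inner_loss, list(params.values()),
                                create_graph=True)
    new_params = {k: p - inner_lr * g
                  for (k, p), g in zip(params.items(), grads)}

    # Outer step: evaluate the updated network on real labeled data and
    # backpropagate through the inner update into theta only. The real
    # examples steer the synthesis engine but never train seg_net directly.
    outer_loss = loss_fn(functional_call(seg_net, new_params, (real_img,)),
                         real_lab)
    opt_theta.zero_grad()
    outer_loss.backward()
    opt_theta.step()
```

In practice one would alternate such meta-steps with ordinary training of `seg_net` on synthetic data, so that the real examples only ever influence the augmentation engine.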
Abstract: Surface-based analysis of the cerebral cortex is ubiquitous in human neuroimaging with MRI. It is crucial for cortical registration, parcellation, and thickness estimation. Traditionally, these analyses require high-resolution, isotropic scans with good gray-white matter contrast, typically a 1 mm T1-weighted scan. This excludes most clinical MRI scans, which are often anisotropic and lack the necessary T1 contrast. To enable large-scale neuroimaging studies using vast amounts of clinical data, we introduce recon-all-clinical, a novel method for cortical reconstruction, registration, parcellation, and thickness estimation in brain MRI scans of any resolution and contrast. Our approach is a hybrid analysis method that combines a convolutional neural network (CNN) trained with domain randomization to predict signed distance functions (SDFs) and classical geometry processing for accurate surface placement while maintaining topological and geometric constraints. The method does not require retraining for different acquisitions, thus simplifying the analysis of heterogeneous clinical datasets. We tested recon-all-clinical on multiple datasets, including over 19,000 clinical scans. The method consistently produced precise cortical reconstructions and high parcellation accuracy across varied MRI contrasts and resolutions. Cortical thickness estimates are precise enough to capture aging effects independently of MRI contrast, although accuracy varies with slice thickness. Our method is publicly available at https://surfer.nmr.mgh.harvard.edu/fswiki/recon-all-clinical, enabling researchers to perform detailed cortical analysis on the vast amounts of existing clinical MRI scans. This advancement may be particularly valuable for studying rare diseases and underrepresented populations where research-grade MRI data is scarce.
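As a rough illustration of the surface-placement stage, the zero level set of a predicted SDF can be extracted as a triangle mesh with marching cubes. The sketch below uses scikit-image and a synthetic sphere as a stand-in for a CNN-predicted SDF, and omits the topological and geometric constraints that the actual geometry processing enforces.

```python
import numpy as np
from skimage import measure

# sdf: 3D array of predicted signed distances to a cortical surface
# (negative inside, positive outside). Here a synthetic sphere stands in
# for the CNN prediction.
x, y, z = np.mgrid[-32:32, -32:32, -32:32]
sdf = np.sqrt(x**2 + y**2 + z**2) - 20.0

# Extract the zero level set as a triangle mesh. The real pipeline instead
# places surfaces with classical geometry processing under topological and
# geometric constraints.
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
print(f"{len(verts)} vertices, {len(faces)} triangles")
```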
Abstract: Segmentation of brain structures on magnetic resonance imaging (MRI) is a highly relevant neuroimaging topic, as it is a prerequisite for downstream analyses such as volumetry and shape analysis. Automated segmentation facilitates the study of brain structures in larger cohorts when compared with manual segmentation, which is time-consuming. However, the development of most automated methods relies on large, manually annotated datasets, which limits the generalizability of these methods. Recently, new techniques using synthetic images have emerged, reducing the need for manual annotation. Here we provide HELM, Hypothalamic ex vivo Label Maps, a dataset composed of label maps built from publicly available ultra-high resolution ex vivo MRI from 10 whole hemispheres, which can be used to develop segmentation methods using synthetic data. The label maps are obtained with a combination of manual labels for the hypothalamic regions and automated segmentations for the rest of the brain, and mirrored to simulate entire brains. We also provide the pre-processed ex vivo scans, so that the dataset can support future projects that include other structures once they are manually segmented.
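Label maps such as HELM's are typically turned into training images with a domain-randomization generator in the spirit of SynthSeg: sample a random intensity distribution per label, then corrupt with a bias field and blurring. The sketch below is illustrative only; the distributions and parameters are assumptions, not the dataset's prescribed protocol.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def synth_image(labels, rng=np.random.default_rng()):
    """Render a random-contrast image from an integer label map (SynthSeg-style)."""
    # One random Gaussian intensity distribution per anatomical label.
    means = rng.uniform(0, 255, labels.max() + 1)
    stds = rng.uniform(0, 25, labels.max() + 1)
    img = rng.normal(means[labels], stds[labels])
    # Smooth multiplicative bias field and blur, mimicking acquisition effects.
    bias = gaussian_filter(rng.normal(0.0, 0.3, labels.shape), sigma=20)
    img = gaussian_filter(img * np.exp(bias), sigma=rng.uniform(0.5, 1.5))
    return np.clip(img, 0, 255)

# Usage: a toy label map with one foreground structure.
labels = np.zeros((64, 64, 64), dtype=int)
labels[20:40, 20:40, 20:40] = 1
img = synth_image(labels)
```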
Abstract: In human neuroimaging studies, atlas registration enables mapping MRI scans to a common coordinate frame, which is necessary to aggregate data from multiple subjects. Machine learning registration methods have achieved excellent speed and accuracy but lack interpretability. More recently, keypoint-based methods have been proposed to tackle this issue, but their accuracy is still subpar, particularly when fitting nonlinear transforms. Here we propose Registration by Regression (RbR), a novel atlas registration framework that is highly robust and flexible, conceptually simple, and can be trained with cheaply obtained data. RbR predicts the (x,y,z) atlas coordinates for every voxel of the input scan (i.e., every voxel is a keypoint), and then uses closed-form expressions to quickly fit transforms using a wide array of possible deformation models, including affine and nonlinear (e.g., B-spline, Demons, invertible diffeomorphic models, etc.). Robustness is provided by the large number of voxels informing the registration and can be further increased by robust estimators like RANSAC. Experiments on independent public datasets show that RbR yields more accurate registration than competing keypoint approaches, while providing full control of the deformation model.
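For the affine case, the closed-form fit reduces to linear least squares from voxel coordinates to the predicted atlas coordinates. A minimal NumPy sketch (function and variable names are ours, not the paper's):

```python
import numpy as np

def fit_affine(voxel_xyz, pred_atlas_xyz):
    """Least-squares affine from voxel coordinates to predicted atlas coordinates.

    voxel_xyz, pred_atlas_xyz: (N, 3) arrays; every voxel acts as a keypoint.
    Returns a 3x4 matrix [A | t] minimizing ||A x + t - y||^2 over all voxels.
    """
    X = np.hstack([voxel_xyz, np.ones((len(voxel_xyz), 1))])  # homogeneous coords
    M, *_ = np.linalg.lstsq(X, pred_atlas_xyz, rcond=None)    # (4, 3) solution
    return M.T                                                # (3, 4) = [A | t]
```

A RANSAC wrapper would repeatedly fit on random voxel subsets and keep the transform with the largest consensus set, which is one way the abstract's robustness claim can be realized.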
Abstract: White matter hyperintensities (WMH) are a hallmark of cerebrovascular disease and multiple sclerosis. Automated WMH segmentation methods enable quantitative analysis via estimation of total lesion load, spatial distribution of lesions, and number of lesions (i.e., number of connected components after thresholding), all of which are correlated with patient outcomes. While the first two measures can generally be estimated robustly, the number of lesions is highly sensitive to noise and segmentation mistakes -- even when small connected components are eroded or disregarded. In this article, we present P-Count, an algebraic WMH counting tool based on persistent homology that accounts for the topological features of WM lesions in a robust manner. Using computational geometry, P-Count takes the persistence of connected components into consideration, effectively filtering out noisy false positives and resulting in a more accurate count of true lesions. We validated P-Count on the ISBI2015 longitudinal lesion segmentation dataset, where it produces significantly more accurate results than direct thresholding.
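The underlying idea can be illustrated with 0-dimensional persistence of superlevel sets of a WMH probability map: each connected component is born at its peak probability, dies when it merges into a component with a higher peak, and counts as a lesion only if its persistence exceeds a threshold. The union-find sketch below is an illustrative reimplementation of that idea, not the published P-Count code.

```python
import numpy as np

def p_count(prob, tau):
    """Count lesions via 0-dim persistence of superlevel sets of `prob`."""
    dims, flat = prob.shape, prob.ravel()
    strides = [int(np.prod(dims[d + 1:])) for d in range(len(dims))]
    order = np.argsort(-flat)                     # voxels by descending value
    parent = np.full(flat.size, -1, dtype=np.int64)
    birth = np.empty(flat.size)

    def find(i):                                  # union-find root lookup
        root = i
        while parent[root] != root:
            root = parent[root]
        parent[i] = root                          # partial path compression
        return root

    count = 0
    for idx in order:
        parent[idx], birth[idx] = idx, flat[idx]  # voxel enters superlevel set
        coord = np.unravel_index(idx, dims)
        for d in range(len(dims)):                # 6-connected neighbors
            for step in (-1, 1):
                if not 0 <= coord[d] + step < dims[d]:
                    continue
                nb = idx + step * strides[d]
                if parent[nb] == -1:              # neighbor not yet in set
                    continue
                ra, rb = find(idx), find(nb)
                if ra == rb:
                    continue
                # The component with the lower peak dies at the current level.
                young, old = (ra, rb) if birth[ra] <= birth[rb] else (rb, ra)
                if birth[young] - flat[idx] > tau:
                    count += 1                    # persistent enough: a lesion
                parent[young] = old
    # Components that never merged persist down to the minimum value.
    roots = {find(i) for i in range(flat.size)}
    return count + sum(birth[r] - flat.min() > tau for r in roots)
```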
Abstract: Remarkable progress has been made by data-driven machine-learning methods in the analysis of MRI scans. However, most existing MRI analysis approaches are crafted for specific MR pulse sequences (MR contrasts) and usually require nearly isotropic acquisitions. This limits their applicability to diverse real-world clinical data, where scans commonly exhibit variations in appearance because they are acquired with varying sequence parameters, resolutions, and orientations -- especially in the presence of pathology. In this paper, we propose PEPSI, the first pathology-enhanced, pulse-sequence-invariant feature representation learning model for brain MRI. PEPSI is trained entirely on synthetic images with a novel pathology encoding strategy, and enables co-training across datasets with diverse pathologies and missing modalities. Despite variations in pathology appearance across different MR pulse sequences or in the quality of acquired images (e.g., resolution, orientation, artifacts, etc.), PEPSI produces a high-resolution image of reference contrast (MP-RAGE) that captures anatomy, along with an image specifically highlighting the pathology. Our experiments demonstrate PEPSI's remarkable capability for image synthesis compared with state-of-the-art contrast-agnostic synthesis models, as it accurately reconstructs anatomical structures while differentiating between pathology and normal tissue. We further illustrate the efficiency and effectiveness of PEPSI features for downstream pathology segmentation on five public datasets covering white matter hyperintensities and stroke lesions. Code is available at https://github.com/peirong26/PEPSI.
Abstract: Purpose: To develop a method for automated segmentation of hypothalamus subregions informed by ultra-high resolution ex vivo magnetic resonance images (MRI), which generalizes across MRI sequences and resolutions without retraining. Materials and Methods: We trained our deep learning method, H-SynEx, with synthetic images derived from label maps built from ultra-high resolution ex vivo MRI scans, which enable finer-grained manual segmentation than 1 mm isotropic in vivo images. We validated this retrospective study using 1535 in vivo images from six datasets and six MRI sequences. The quantitative evaluation used the Dice coefficient (DC) and average Hausdorff distance (AVD). Statistical analysis compared hypothalamic subregion volumes in controls, Alzheimer's disease (AD), and behavioral variant frontotemporal dementia (bvFTD) subjects using the area under the curve (AUC) and the Wilcoxon rank sum test. Results: H-SynEx can segment the hypothalamus across various MRI sequences, including FLAIR sequences with large slice spacing (5 mm). Using hypothalamic volumes on T1w images to distinguish controls from AD and bvFTD patients, we observed AUC values of 0.74 and 0.79, respectively. Additionally, an AUC of 0.66 was found for volume variation on FLAIR scans when comparing controls and patients. Conclusion: Our results show that H-SynEx successfully leverages information from ultra-high resolution scans to segment in vivo scans from different MRI sequences such as T1w, T2w, PD, qT1, FA, and FLAIR. We also found that our automated segmentation was able to discriminate controls versus patients on FLAIR images with 5 mm slice spacing. H-SynEx is openly available at https://github.com/liviamarodrigues/hsynex.
Abstract: Brain atrophy and white matter hyperintensities (WMH) are critical neuroimaging features for ascertaining brain injury in cerebrovascular disease and multiple sclerosis. Automated segmentation and quantification are desirable, but existing methods require high-resolution MRI with good signal-to-noise ratio (SNR). This precludes application to clinical and low-field portable MRI (pMRI) scans, thus hampering large-scale tracking of atrophy and WMH progression, especially in underserved areas where pMRI has huge potential. Here we present a method that segments white matter hyperintensities and 36 brain regions from scans of any resolution and contrast (including pMRI) without retraining. We show results on six public datasets and on a private dataset with paired high- and low-field scans (3T and 64mT), where we attain strong correlations between the WMH (ρ = 0.85) and hippocampal (r = 0.89) volumes estimated at the two field strengths. Our method is publicly available as part of FreeSurfer, at: http://surfer.nmr.mgh.harvard.edu/fswiki/WMH-SynthSeg.
Abstract: Recent learning-based approaches have made astonishing advances in calibrated medical imaging like computed tomography, yet they struggle to generalize in uncalibrated modalities -- most notably magnetic resonance imaging (MRI), where performance is highly sensitive to the differences in MR contrast, resolution, and orientation between the training and testing data. This prevents broad applicability to the diverse clinical acquisition protocols in the real world. We introduce Brain-ID, a robust feature representation learning strategy for brain imaging, which is contrast-agnostic and captures the brain anatomy of each subject regardless of the appearance of the acquired images (i.e., deformation, contrast, resolution, orientation, artifacts, etc.). Brain-ID is trained entirely on synthetic data, and easily adapts to downstream tasks with our proposed simple one-layer solution. We validate the robustness of Brain-ID features, and evaluate their performance in a variety of downstream applications, including both contrast-independent (anatomy reconstruction/contrast synthesis, brain segmentation) and contrast-dependent (super-resolution, bias field estimation) tasks. Extensive experiments on 6 public datasets demonstrate that Brain-ID achieves state-of-the-art performance in all tasks, and, more importantly, preserves its performance when only limited training data is available.
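A plausible form of the one-layer adaptation is a single convolutional head trained on top of the frozen feature extractor. The PyTorch sketch below uses hypothetical channel counts and is not necessarily the paper's exact head.

```python
import torch.nn as nn

class OneLayerHead(nn.Module):
    """Single conv layer mapping frozen Brain-ID features to a task output
    (e.g., segmentation logits or a synthesized contrast). Shapes are
    hypothetical."""
    def __init__(self, feat_channels=64, out_channels=1):
        super().__init__()
        self.head = nn.Conv3d(feat_channels, out_channels,
                              kernel_size=3, padding=1)

    def forward(self, feats):
        return self.head(feats)

# Usage sketch: freeze the pretrained extractor, train only the head.
# for p in brain_id.parameters():
#     p.requires_grad_(False)
# logits = OneLayerHead(64, num_classes)(brain_id(image))
```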
Abstract: Recent years have seen a growing interest in methods for predicting a variable of interest, such as a subject's diagnosis, from medical images. Methods based on discriminative modeling excel at making accurate predictions, but struggle to explain their decisions in anatomically meaningful terms. In this paper, we propose a simple technique for single-subject prediction that is inherently interpretable. It augments the generative models used in classical human brain mapping techniques, in which cause-effect relations can be encoded, with a multivariate noise model that captures dominant spatial correlations. Experiments demonstrate that the resulting model can be efficiently inverted to make accurate subject-level predictions, while at the same time offering intuitive causal explanations of its inner workings. The method is easy to use: training is fast for typical training set sizes, and only a single hyperparameter needs to be set by the user. Our code is available at https://github.com/chiara-mauri/Interpretable-subject-level-prediction.
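For a discrete target such as diagnosis, inverting a Gaussian generative model amounts to applying Bayes' rule with the class-conditional means and the multivariate noise covariance. A minimal NumPy sketch under that assumption; the paper's exact noise parameterization and inference may differ.

```python
import numpy as np

def invert_generative_model(x, means, cov, priors):
    """Bayes-rule inversion of a Gaussian generative model of images.

    x:      (D,)   vectorized image of one subject;
    means:  (K, D) class-conditional mean images (encoding cause-effect
            relations, as in classical brain mapping);
    cov:    (D, D) shared noise covariance capturing spatial correlations;
    priors: (K,)   class priors.
    """
    diff = x[None, :] - means                           # (K, D)
    sol = np.linalg.solve(cov, diff.T).T                # Sigma^{-1} (x - mu_k)
    log_lik = -0.5 * np.einsum('kd,kd->k', diff, sol)   # up to a shared constant
    log_post = log_lik + np.log(priors)
    log_post -= log_post.max()                          # numerical stability
    post = np.exp(log_post)
    return post / post.sum()                            # p(class | image)
```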