Abstract:A critical challenge for tumour segmentation models is the ability to adapt to diverse clinical settings, particularly when applied to poor-quality neuroimaging data. The uncertainty surrounding this adaptation stems from the lack of representative datasets, leaving top-performing models without exposure to common artifacts found in MRI data throughout Sub-Saharan Africa (SSA). We replicated a framework that secured the 2nd position in the 2022 BraTS competition to investigate the impact of dataset composition on model performance and pursued four distinct approaches through training a model with: 1) BraTS-Africa data only (train_SSA, N=60), 2) BraTS-Adult Glioma data only (train_GLI, N=1251), 3) both datasets together (train_ALL, N=1311), and 4) through further training the train_GLI model with BraTS-Africa data (train_ftSSA). Notably, training on a smaller low-quality dataset alone (train_SSA) yielded subpar results, and training on a larger high-quality dataset alone (train_GLI) struggled to delineate oedematous tissue in the low-quality validation set. The most promising approach (train_ftSSA) involved pre-training a model on high-quality neuroimages and then fine-tuning it on the smaller, low-quality dataset. This approach outperformed the others, ranking second in the MICCAI BraTS Africa global challenge external testing phase. These findings underscore the significance of larger sample sizes and broad exposure to data in improving segmentation performance. Furthermore, we demonstrated that there is potential for improving such models by fine-tuning them with a wider range of data locally.
Abstract:Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate if morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, has the potential to provide diagnostic and predictive biomarkers for MDD. However, previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies. In this study, we used globally representative data from the ENIGMA-MDD working group containing an extensive sample of people with MDD (N=2,772) and HC (N=4,240), which allows a comprehensive analysis with generalizable results. Based on the hypothesis that integration of vertex-wise cortical features can improve classification performance, we evaluated the classification of a DenseNet and a Support Vector Machine (SVM), with the expectation that the former would outperform the latter. As we analyzed a multi-site sample, we additionally applied the ComBat harmonization tool to remove potential nuisance effects of site. We found that both classifiers exhibited close to chance performance (balanced accuracy DenseNet: 51%; SVM: 53%), when estimated on unseen sites. Slightly higher classification performance (balanced accuracy DenseNet: 58%; SVM: 55%) was found when the cross-validation folds contained subjects from all sites, indicating site effect. In conclusion, the integration of vertex-wise morphometric features and the use of the non-linear classifier did not lead to the differentiability between MDD and HC. Our results support the notion that MDD classification on this combination of features and classifiers is unfeasible.