Abstract: Background. Federated learning (FL) has gained wide popularity as a collaborative learning paradigm enabling collaborative AI in sensitive healthcare applications. Nevertheless, the practical implementation of FL presents technical and organizational challenges, as it generally requires complex communication infrastructures. In this context, consensus-based learning (CBL) may represent a promising collaborative learning alternative, thanks to its ability to combine local knowledge into a federated decision system while potentially reducing deployment overhead. Methods. In this work we propose an extensive benchmark of the accuracy and cost-effectiveness of a panel of FL and CBL methods in a wide range of collaborative medical data analysis scenarios. The benchmark includes 7 different medical datasets, encompassing 3 machine learning tasks, 8 different data modalities, and multi-centric settings involving 3 to 23 clients. Findings. Our results reveal that CBL is a cost-effective alternative to FL. When compared across the panel of medical datasets in the considered benchmark, CBL methods provide accuracy equivalent to that achieved by FL. Moreover, CBL significantly reduces training time and communication cost (15-fold and 60-fold decreases, respectively; p < 0.05). Interpretation. This study opens a novel perspective on the deployment of collaborative AI in real-world applications, where the adoption of cost-effective methods is instrumental to achieving sustainability and democratisation of AI by alleviating the need for extensive computational resources.
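To make the cost argument concrete, the following minimal sketch contrasts a single FedAvg-style weight exchange with a consensus-style combination of local predictions. All array shapes, the random data, and the averaging rules are hypothetical stand-ins, not the benchmarked methods themselves.

```python
# Minimal sketch: FedAvg repeats a weight exchange every round, whereas
# consensus combines the outputs of fully trained local models once.
import numpy as np

rng = np.random.default_rng(0)

n_clients, n_params, n_samples, n_classes = 3, 10, 5, 2
client_weights = [rng.normal(size=n_params) for _ in range(n_clients)]
client_probs = [rng.dirichlet(np.ones(n_classes), size=n_samples)
                for _ in range(n_clients)]

# Federated learning (FedAvg): clients repeatedly send model weights to a
# server that averages them; one such exchange happens every training round.
fedavg_weights = np.mean(client_weights, axis=0)

# Consensus-based learning: each client trains locally to convergence once,
# then only final models/predictions are shared and combined, e.g. averaged.
consensus_probs = np.mean(client_probs, axis=0)
print(fedavg_weights.shape, consensus_probs.argmax(axis=1))
```

The communication asymmetry follows directly: FL pays the weight-exchange cost once per round over many rounds, while the consensus step above is a one-off aggregation.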
Abstract: Early and accurate diagnosis of Parkinson's Disease (PD) remains challenging. This study compares deep learning architectures for MRI-based PD classification, introducing the first three-dimensional (3D) implementation of Convolutional Kolmogorov-Arnold Networks (ConvKANs), a new approach that combines convolutional layers with adaptive, spline-based activations. We evaluated Convolutional Neural Networks (CNNs), ConvKANs, and Graph Convolutional Networks (GCNs) using three open-source datasets comprising a total of 142 participants (75 with PD and 67 age-matched healthy controls). For 2D analysis, we extracted 100 axial slices centred on the midbrain from each T1-weighted scan. For 3D analysis, we used the entire volumetric scans. ConvKANs integrate learnable B-spline functions with convolutional layers. GCNs represent MRI data as graphs, theoretically capturing structural relationships that may be overlooked by traditional approaches. Interpretability visualizations, including the first ConvKAN spline activation maps and projections of graph node embeddings, are presented. ConvKANs demonstrated high performance across datasets and dimensionalities, achieving the highest 2D AUROC (0.98) in one dataset and matching the peak 3D performance of CNNs (1.00). CNN models performed well, while GCN models improved in 3D analyses, reaching up to 0.97 AUROC. 3D implementations yielded higher AUROC values than their 2D counterparts across all models. The ConvKAN implementation shows promise for MRI analysis in PD classification, particularly in the context of early diagnosis. The improvement in 3D analyses highlights the value of volumetric data in capturing subtle PD-related changes. While MRI is not currently used for PD diagnosis, these findings suggest its potential as a component of a multimodal diagnostic approach, especially for early detection.
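The core ConvKAN idea (a convolution followed by a learnable, spline-like activation) can be sketched in a few lines of PyTorch. The snippet below is an illustrative simplification, not the paper's implementation: it substitutes a fixed Gaussian basis for B-splines to stay short, and all layer names and sizes are hypothetical.

```python
# Illustrative sketch of a 3D ConvKAN-style block: Conv3d followed by a
# per-channel learnable activation expressed as a weighted sum of basis bumps.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplineActivation(nn.Module):
    """Learnable per-channel activation (Gaussian basis stands in for B-splines)."""
    def __init__(self, channels: int, n_basis: int = 8, grid: float = 2.0):
        super().__init__()
        self.register_buffer("centers", torch.linspace(-grid, grid, n_basis))
        self.coeffs = nn.Parameter(torch.zeros(channels, n_basis))
        self.width = 2 * grid / (n_basis - 1)

    def forward(self, x):  # x: (B, C, D, H, W)
        # Evaluate each basis bump at every voxel, then mix per channel.
        basis = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        return torch.einsum("bcdhwk,ck->bcdhw", basis, self.coeffs) + F.silu(x)

class ConvKANBlock3D(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1)
        self.act = SplineActivation(out_ch)

    def forward(self, x):
        return self.act(self.conv(x))

block = ConvKANBlock3D(1, 8)
print(block(torch.randn(2, 1, 16, 16, 16)).shape)  # torch.Size([2, 8, 16, 16, 16])
```

Because the activation's basis coefficients are learned per channel, the nonlinearity itself adapts during training, which is what distinguishes the KAN-style block from a CNN block with a fixed activation.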
Abstract: Healthcare data is often split into medium/small-sized collections across multiple hospitals, and access to it is encumbered by privacy regulations. This makes it difficult to use such data for the development of machine learning and deep learning models, which are known to be data-hungry. One way to overcome this limitation is to use collaborative learning (CL) methods, which allow hospitals to work collaboratively to solve a task without the need to explicitly share local data. In this paper, we address a prostate segmentation problem from MRI in a collaborative scenario by comparing two different approaches: federated learning (FL) and consensus-based methods (CBM). To the best of our knowledge, this is the first work in which CBM, such as label fusion techniques, are used to solve a collaborative learning problem. In this setting, CBM combine predictions from locally trained models to obtain a federated strong learner with ideally improved robustness and predictive variance properties. Our experiments show that, in the considered practical scenario, CBM provide equal or better results than FL while being highly cost-effective. Our results demonstrate that the consensus paradigm may represent a valid alternative to FL for typical training tasks in medical imaging.
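As a minimal sketch of what label-fusion-style consensus can look like for segmentation, the snippet below applies simple majority voting over binary masks; the paper's actual fusion techniques may differ, and the mask shapes and random predictions are hypothetical.

```python
# Minimal sketch of consensus via label fusion: majority voting over binary
# segmentation masks produced by each hospital's locally trained model.
import numpy as np

rng = np.random.default_rng(1)
n_models, D, H, W = 5, 4, 8, 8

# Hypothetical binary prostate masks predicted by each local model.
masks = rng.integers(0, 2, size=(n_models, D, H, W))

# Majority vote: a voxel is foreground when most local models agree.
fused = (masks.mean(axis=0) >= 0.5).astype(np.uint8)
print(fused.shape)  # (4, 8, 8)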
Abstract: Cancer is a highly heterogeneous condition that can occur almost anywhere in the human body. 18F-fluorodeoxyglucose positron emission tomography (PET) is an imaging modality commonly used to detect cancer due to its high sensitivity and clear visualisation of the pattern of metabolic activity. Nonetheless, as cancer is highly heterogeneous, it is challenging to train general-purpose discriminative cancer detection models, with data availability and disease complexity often cited as limiting factors. Unsupervised anomaly detection models have been suggested as a putative solution. These models learn a healthy representation of tissue and detect cancer by predicting deviations from the healthy norm, which requires models capable of accurately learning long-range interactions between organs and their imaging patterns with high levels of expressivity. Such characteristics are suitably satisfied by transformers, which have been shown to achieve state-of-the-art results in unsupervised anomaly detection by training on normal data. This work expands upon such approaches by introducing multi-modal conditioning of the transformer via cross-attention, i.e., supplying anatomical reference from paired CT. Using 294 whole-body PET/CT samples, we show that our anomaly detection method is robust and capable of achieving accurate cancer localization results even in cases where normal training data is unavailable. In addition, we show the efficacy of this approach on out-of-sample data, showcasing its generalizability with limited training data. Lastly, we propose to combine model uncertainty with a new kernel density estimation approach, and show that it provides clinically and statistically significant improvements compared to classic residual-based anomaly maps. Overall, superior performance is demonstrated against leading state-of-the-art alternatives, drawing attention to the potential of these approaches.
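The cross-attention conditioning described above can be sketched with a standard attention layer, with PET tokens as queries and CT tokens as keys and values. The token counts, dimensions, and layer choice below are hypothetical, not the paper's architecture.

```python
# Sketch of multi-modal conditioning via cross-attention: PET token embeddings
# query paired CT token embeddings, injecting anatomical reference.
import torch
import torch.nn as nn

d_model, n_heads = 64, 4
cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

pet_tokens = torch.randn(2, 128, d_model)  # queries: PET sequence
ct_tokens = torch.randn(2, 128, d_model)   # keys/values: anatomical reference

conditioned, _ = cross_attn(query=pet_tokens, key=ct_tokens, value=ct_tokens)
print(conditioned.shape)  # torch.Size([2, 128, 64])
```

The design point is that anatomy enters only through keys and values, so the model can learn where metabolic activity is normal for a given structure without the CT dominating the PET representation.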
Abstract: Understanding the internal physiological changes accompanying the aging process is an important aspect of medical image interpretation, with the expected changes acting as a baseline when reporting abnormal findings. Deep learning has recently been demonstrated to allow the accurate estimation of patient age from chest X-rays, and shows potential as a health indicator and mortality predictor. In this paper we present a novel comparative study of the relative performance of radiologists versus state-of-the-art deep learning models on two tasks: (a) patient age estimation from a single chest X-ray, and (b) ranking of two time-separated images of the same patient by age. We train our models on a heterogeneous database of 1.8M chest X-rays with ground-truth patient ages, investigate the limitations on model accuracy imposed by limited training data and image resolution, and demonstrate generalisation performance on public data. To explore the large performance gap between the models and humans on these age-prediction tasks, compared with other radiological reporting tasks seen in the literature, we incorporate our age prediction model into a conditional Generative Adversarial Network (cGAN), allowing visualisation of the semantic features identified by the prediction model as significant to age prediction and comparison of the identified features with those relied on by clinicians.
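The two evaluation tasks can be made concrete with a toy sketch: regress age from a single image, then rank a time-separated pair by comparing predicted ages. The backbone and image sizes below are placeholders, not the trained model.

```python
# Toy sketch of the two tasks: (a) single-image age regression and
# (b) ranking two time-separated scans of the same patient by predicted age.
import torch
import torch.nn as nn

backbone = nn.Sequential(  # placeholder for a real chest X-ray regressor
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
)

def predict_age(xray):          # task (a): single-image age estimate
    return backbone(xray).squeeze(-1)

def rank_pair(earlier, later):  # task (b): is the second scan older?
    return (predict_age(later) > predict_age(earlier)).float()

x0, x1 = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)
print(predict_age(x0).shape, rank_pair(x0, x1))
```

Note that task (b) needs no extra training signal: it reuses the regressor from task (a), so the ranking accuracy directly probes whether the model's age estimates are consistent within a patient.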
Abstract: Machine learning approaches hold great potential for the automated detection of lung nodules in chest radiographs, but training the algorithms requires very large amounts of manually annotated images, which are difficult to obtain. Weak labels indicating whether a radiograph is likely to contain pulmonary nodules are typically easier to obtain at scale by parsing historical free-text radiological reports associated with the radiographs. Using a repository of over 700,000 chest radiographs, in this study we demonstrate that promising nodule detection performance can be achieved using weak labels through convolutional neural networks for radiograph classification. We propose two network architectures for the classification of images likely to contain pulmonary nodules using both weak labels and manually delineated bounding boxes, when these are available. Annotated nodules are used at training time to deliver a visual attention mechanism informing the model about its localisation performance. The first architecture extracts saliency maps from high-level convolutional layers and compares the estimated position of a nodule against the ground truth, when this is available. A corresponding localisation error is then back-propagated along with the softmax classification error. The second approach consists of a recurrent attention model that learns to observe a short sequence of smaller image portions through reinforcement learning. When a nodule annotation is available at training time, the reward function is modified accordingly so that exploring portions of the radiograph away from a nodule incurs a larger penalty. Our empirical results demonstrate the potential advantages of these architectures in comparison to competing methodologies.
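A rough sketch of the first architecture's training signal follows, assuming a simplified saliency head and a binary-mask rendering of the bounding boxes (both hypothetical): the weak-label classification loss is always applied, and a localisation term is added only for samples with annotations.

```python
# Sketch of combining a weak-label softmax loss with a localisation loss on a
# saliency map, applied only where a nodule bounding box is available.
import torch
import torch.nn as nn
import torch.nn.functional as F

trunk = nn.Conv2d(1, 8, 3, padding=1)   # stand-in convolutional trunk
saliency_head = nn.Conv2d(8, 1, 1)      # 1-channel saliency map
classifier = nn.Linear(8, 2)            # nodule vs. no-nodule

x = torch.randn(4, 1, 32, 32)
weak_label = torch.tensor([1, 0, 1, 1])        # parsed from report text
bbox_mask = torch.zeros(4, 1, 32, 32)          # bounding boxes as masks
bbox_mask[0, :, 8:16, 8:16] = 1.0              # box known for sample 0 only
has_bbox = torch.tensor([True, False, False, False])

feats = F.relu(trunk(x))
logits = classifier(feats.mean(dim=(2, 3)))
loss = F.cross_entropy(logits, weak_label)     # weak-label term, always on

saliency = torch.sigmoid(saliency_head(feats))
if has_bbox.any():                             # localisation term, annotated
    loss = loss + F.binary_cross_entropy(      # samples only
        saliency[has_bbox], bbox_mask[has_bbox])
print(loss.item())
```

Back-propagating this combined loss is the mechanism the abstract describes: the saliency penalty steers the classifier's attention toward annotated nodules without requiring boxes for every radiograph.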