Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Eli Gibson

AdaViT: Adaptive Vision Transformer for Flexible Pretrain and Finetune with Variable 3D Medical Image Modalities

Apr 04, 2025

Badhan Kumar Das, Gengyan Zhao, Han Liu, Thomas J. Re, Dorin Comaniciu, Eli Gibson, Andreas Maier

Abstract:Pretrain techniques, whether supervised or self-supervised, are widely used in deep learning to enhance model performance. In real-world clinical scenarios, different sets of magnetic resonance (MR) contrasts are often acquired for different subjects/cases, creating challenges for deep learning models assuming consistent input modalities among all the cases and between pretrain and finetune. Existing methods struggle to maintain performance when there is an input modality/contrast set mismatch with the pretrained model, often resulting in degraded accuracy. We propose an adaptive Vision Transformer (AdaViT) framework capable of handling variable set of input modalities for each case. We utilize a dynamic tokenizer to encode different input image modalities to tokens and take advantage of the characteristics of the transformer to build attention mechanism across variable length of tokens. Through extensive experiments, we demonstrate that this architecture effectively transfers supervised pretrained models to new datasets with different input modality/contrast sets, resulting in superior performance on zero-shot testing, few-shot finetuning, and backward transferring in brain infarct and brain tumor segmentation tasks. Additionally, for self-supervised pretrain, the proposed method is able to maximize the pretrain data and facilitate transferring to diverse downstream tasks with variable sets of input modalities.

Via

Access Paper or Ask Questions

A Non-contrast Head CT Foundation Model for Comprehensive Neuro-Trauma Triage

Feb 28, 2025

Youngjin Yoo, Bogdan Georgescu, Yanbo Zhang, Sasa Grbic, Han Liu, Gabriela D. Aldea, Thomas J. Re, Jyotipriya Das, Poikavila Ullaskrishnan, Eva Eibenberger(+7 more)

Figure 1 for A Non-contrast Head CT Foundation Model for Comprehensive Neuro-Trauma Triage

Figure 2 for A Non-contrast Head CT Foundation Model for Comprehensive Neuro-Trauma Triage

Figure 3 for A Non-contrast Head CT Foundation Model for Comprehensive Neuro-Trauma Triage

Figure 4 for A Non-contrast Head CT Foundation Model for Comprehensive Neuro-Trauma Triage

Abstract:Recent advancements in AI and medical imaging offer transformative potential in emergency head CT interpretation for reducing assessment times and improving accuracy in the face of an increasing request of such scans and a global shortage in radiologists. This study introduces a 3D foundation model for detecting diverse neuro-trauma findings with high accuracy and efficiency. Using large language models (LLMs) for automatic labeling, we generated comprehensive multi-label annotations for critical conditions. Our approach involved pretraining neural networks for hemorrhage subtype segmentation and brain anatomy parcellation, which were integrated into a pretrained comprehensive neuro-trauma detection network through multimodal fine-tuning. Performance evaluation against expert annotations and comparison with CT-CLIP demonstrated strong triage accuracy across major neuro-trauma findings, such as hemorrhage and midline shift, as well as less frequent critical conditions such as cerebral edema and arterial hyperdensity. The integration of neuro-specific features significantly enhanced diagnostic capabilities, achieving an average AUC of 0.861 for 16 neuro-trauma conditions. This work advances foundation models in medical imaging, serving as a benchmark for future AI-assisted neuro-trauma diagnostics in emergency radiology.

Via

Access Paper or Ask Questions

Self Pre-training with Adaptive Mask Autoencoders for Variable-Contrast 3D Medical Imaging

Jan 15, 2025

Badhan Kumar Das, Gengyan Zhao, Han Liu, Thomas J. Re, Dorin Comaniciu, Eli Gibson, Andreas Maier

Abstract:The Masked Autoencoder (MAE) has recently demonstrated effectiveness in pre-training Vision Transformers (ViT) for analyzing natural images. By reconstructing complete images from partially masked inputs, the ViT encoder gathers contextual information to predict the missing regions. This capability to aggregate context is especially important in medical imaging, where anatomical structures are functionally and mechanically linked to surrounding regions. However, current methods do not consider variations in the number of input images, which is typically the case in real-world Magnetic Resonance (MR) studies. To address this limitation, we propose a 3D Adaptive Masked Autoencoders (AMAE) architecture that accommodates a variable number of 3D input contrasts per subject. A magnetic resonance imaging (MRI) dataset of 45,364 subjects was used for pretraining and a subset of 1648 training, 193 validation and 215 test subjects were used for finetuning. The performance demonstrates that self pre-training of this adaptive masked autoencoders can enhance the infarct segmentation performance by 2.8%-3.7% for ViT-based segmentation models.

* 5 pages, ISBI 2025 accepted

Via

Access Paper or Ask Questions

Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment

Jul 08, 2020

Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor, Youngjin Yoo, Eli Gibson, R. S. Vishwanath, Abishek Balachandran, James M. Balter, Yue Cao, Ramandeep Singh(+4 more)

Figure 1 for Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment

Figure 2 for Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment

Figure 3 for Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment

Figure 4 for Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment

Abstract:The interpretation of medical images is a challenging task, often complicated by the presence of artifacts, occlusions, limited contrast and more. Most notable is the case of chest radiography, where there is a high inter-rater variability in the detection and classification of abnormalities. This is largely due to inconclusive evidence in the data or subjective definitions of disease appearance. An additional example is the classification of anatomical views based on 2D Ultrasound images. Often, the anatomical context captured in a frame is not sufficient to recognize the underlying anatomy. Current machine learning solutions for these problems are typically limited to providing probabilistic predictions, relying on the capacity of underlying models to adapt to limited information and the high degree of label noise. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose a system that learns not only the probabilistic estimate for classification, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that this approach is essential to account for the inherent ambiguity characteristic of medical images from different radiologic exams including computed radiography, ultrasonography and magnetic resonance imaging. In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs. In addition, we show that using uncertainty-driven bootstrapping to filter the training data, one can achieve a significant increase in robustness and accuracy.

* Under review at Medical Image Analysis

Via

Access Paper or Ask Questions

Graph Attention Network based Pruning for Reconstructing 3D Liver Vessel Morphology from Contrasted CT Images

Mar 18, 2020

Donghao Zhang, Siqi Liu, Shikha Chaganti, Eli Gibson, Zhoubing Xu, Sasa Grbic, Weidong Cai, Dorin Comaniciu

Figure 1 for Graph Attention Network based Pruning for Reconstructing 3D Liver Vessel Morphology from Contrasted CT Images

Figure 2 for Graph Attention Network based Pruning for Reconstructing 3D Liver Vessel Morphology from Contrasted CT Images

Figure 3 for Graph Attention Network based Pruning for Reconstructing 3D Liver Vessel Morphology from Contrasted CT Images

Figure 4 for Graph Attention Network based Pruning for Reconstructing 3D Liver Vessel Morphology from Contrasted CT Images

Abstract:With the injection of contrast material into blood vessels, multi-phase contrasted CT images can enhance the visibility of vessel networks in the human body. Reconstructing the 3D geometric morphology of liver vessels from the contrasted CT images can enable multiple liver preoperative surgical planning applications. Automatic reconstruction of liver vessel morphology remains a challenging problem due to the morphological complexity of liver vessels and the inconsistent vessel intensities among different multi-phase contrasted CT images. On the other side, high integrity is required for the 3D reconstruction to avoid decision making biases. In this paper, we propose a framework for liver vessel morphology reconstruction using both a fully convolutional neural network and a graph attention network. A fully convolutional neural network is first trained to produce the liver vessel centerline heatmap. An over-reconstructed liver vessel graph model is then traced based on the heatmap using an image processing based algorithm. We use a graph attention network to prune the false-positive branches by predicting the presence probability of each segmented branch in the initial reconstruction using the aggregated CNN features. We evaluated the proposed framework on an in-house dataset consisting of 418 multi-phase abdomen CT images with contrast. The proposed graph network pruning improves the overall reconstruction F1 score by 6.4% over the baseline. It also outperformed the other state-of-the-art curvilinear structure reconstruction algorithms.

Via

Access Paper or Ask Questions

No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks

Mar 08, 2020

Siqi Liu, Arnaud Arindra Adiyoso Setio, Florin C. Ghesu, Eli Gibson, Sasa Grbic, Bogdan Georgescu, Dorin Comaniciu

Figure 1 for No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks

Figure 2 for No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks

Figure 3 for No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks

Figure 4 for No Surprises: Training Robust Lung Nodule Detection for Low-Dose CT Scans by Augmenting with Adversarial Attacks

Abstract:Detecting malignant pulmonary nodules at an early stage can allow medical interventions which increases the survival rate of lung cancer patients. Using computer vision techniques to detect nodules can improve the sensitivity and the speed of interpreting chest CT for lung cancer screening. Many studies have used CNNs to detect nodule candidates. Though such approaches have been shown to outperform the conventional image processing based methods regarding the detection accuracy, CNNs are also known to be limited to generalize on under-represented samples in the training set and prone to imperceptible noise perturbations. Such limitations can not be easily addressed by scaling up the dataset or the models. In this work, we propose to add adversarial synthetic nodules and adversarial attack samples to the training data to improve the generalization and the robustness of the lung nodule detection systems. In order to generate hard examples of nodules from a differentiable nodule synthesizer, we use projected gradient descent (PGD) to search the latent code within a bounded neighbourhood that would generate nodules to decrease the detector response. To make the network more robust to unanticipated noise perturbations, we use PGD to search for noise patterns that can trigger the network to give over-confident mistakes. By evaluating on two different benchmark datasets containing consensus annotations from three radiologists, we show that the proposed techniques can improve the detection performance on real CT data. To understand the limitations of both the conventional networks and the proposed augmented networks, we also perform stress-tests on the false positive reduction networks by feeding different types of artificially produced patches. We show that the augmented networks are more robust to both under-represented nodules as well as resistant to noise perturbations.

Via

Access Paper or Ask Questions

Conditional Segmentation in Lieu of Image Registration

Jun 30, 2019

Yipeng Hu, Eli Gibson, Dean C. Barratt, Mark Emberton, J. Alison Noble, Tom Vercauteren

Figure 1 for Conditional Segmentation in Lieu of Image Registration

Figure 2 for Conditional Segmentation in Lieu of Image Registration

Figure 3 for Conditional Segmentation in Lieu of Image Registration

Abstract:Classical pairwise image registration methods search for a spatial transformation that optimises a numerical measure that indicates how well a pair of moving and fixed images are aligned. Current learning-based registration methods have adopted the same paradigm and typically predict, for any new input image pair, dense correspondences in the form of a dense displacement field or parameters of a spatial transformation model. However, in many applications of registration, the spatial transformation itself is only required to propagate points or regions of interest (ROIs). In such cases, detailed pixel- or voxel-level correspondence within or outside of these ROIs often have little clinical value. In this paper, we propose an alternative paradigm in which the location of corresponding image-specific ROIs, defined in one image, within another image is learnt. This results in replacing image registration by a conditional segmentation algorithm, which can build on typical image segmentation networks and their widely-adopted training strategies. Using the registration of 3D MRI and ultrasound images of the prostate as an example to demonstrate this new approach, we report a median target registration error (TRE) of 2.1 mm between the ground-truth ROIs defined on intraoperative ultrasound images and those propagated from the preoperative MR images. Significantly lower (>34%) TREs were obtained using the proposed conditional segmentation compared with those obtained from a previously-proposed spatial-transformation-predicting registration network trained with the same multiple ROI labels for individual image pairs. We conclude this work by using a quantitative bias-variance analysis to provide one explanation of the observed improvement in registration accuracy.

* Accepted to MICCAI 2019

Via

Access Paper or Ask Questions

Quantifying and Leveraging Classification Uncertainty for Chest Radiograph Assessment

Jun 18, 2019

Florin C. Ghesu, Bogdan Georgescu, Eli Gibson, Sebastian Guendel, Mannudeep K. Kalra, Ramandeep Singh, Subba R. Digumarthy, Sasa Grbic, Dorin Comaniciu

Figure 1 for Quantifying and Leveraging Classification Uncertainty for Chest Radiograph Assessment

Figure 2 for Quantifying and Leveraging Classification Uncertainty for Chest Radiograph Assessment

Figure 3 for Quantifying and Leveraging Classification Uncertainty for Chest Radiograph Assessment

Figure 4 for Quantifying and Leveraging Classification Uncertainty for Chest Radiograph Assessment

Abstract:The interpretation of chest radiographs is an essential task for the detection of thoracic diseases and abnormalities. However, it is a challenging problem with high inter-rater variability and inherent ambiguity due to inconclusive evidence in the data, limited data quality or subjective definitions of disease appearance. Current deep learning solutions for chest radiograph abnormality classification are typically limited to providing probabilistic predictions, relying on the capacity of learning models to adapt to the high degree of label noise and become robust to the enumerated causal factors. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose an automatic system that learns not only the probabilistic estimate on the presence of an abnormality, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that explicitly learning the classification uncertainty as an orthogonal measure to the predicted output, is essential to account for the inherent variability characteristic of this data. Experiments were conducted on two datasets of chest radiographs of over 85,000 patients. Sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC, e.g., by 8% to 0.91 with an expected rejection rate of under 25%. Eliminating training samples using uncertainty-driven bootstrapping, enables a significant increase in robustness and accuracy. In addition, we present a multi-reader study showing that the predictive uncertainty is indicative of reader errors.

* Accepted for presentation at MICCAI 2019

Via

Access Paper or Ask Questions

Multi-task Learning for Chest X-ray Abnormality Classification on Noisy Labels

May 15, 2019

Sebastian Guendel, Florin C. Ghesu, Sasa Grbic, Eli Gibson, Bogdan Georgescu, Andreas Maier, Dorin Comaniciu

Figure 1 for Multi-task Learning for Chest X-ray Abnormality Classification on Noisy Labels

Figure 2 for Multi-task Learning for Chest X-ray Abnormality Classification on Noisy Labels

Figure 3 for Multi-task Learning for Chest X-ray Abnormality Classification on Noisy Labels

Figure 4 for Multi-task Learning for Chest X-ray Abnormality Classification on Noisy Labels

Abstract:Chest X-ray (CXR) is the most common X-ray examination performed in daily clinical practice for the diagnosis of various heart and lung abnormalities. The large amount of data to be read and reported, with 100+ studies per day for a single radiologist, poses a challenge in maintaining consistently high interpretation accuracy. In this work, we propose a method for the classification of different abnormalities based on CXR scans of the human body. The system is based on a novel multi-task deep learning architecture that in addition to the abnormality classification, supports the segmentation of the lungs and heart and classification of regions where the abnormality is located. We demonstrate that by training these tasks concurrently, one can increase the classification performance of the model. Experiments were performed on an extensive collection of 297,541 chest X-ray images from 86,876 patients, leading to a state-of-the-art performance level of 0.883 AUC on average for 12 different abnormalities. We also conducted a detailed performance analysis and compared the accuracy of our system with 3 board-certified radiologists. In this context, we highlight the high level of label noise inherent to this problem. On a reduced subset containing only cases with high confidence reference labels based on the consensus of the 3 radiologists, our system reached an average AUC of 0.945.

Via

Access Paper or Ask Questions

Class-Aware Adversarial Lung Nodule Synthesis in CT Images

Dec 28, 2018

Jie Yang, Siqi Liu, Sasa Grbic, Arnaud Arindra Adiyoso Setio, Zhoubing Xu, Eli Gibson, Guillaume Chabin, Bogdan Georgescu, Andrew F. Laine, Dorin Comaniciu

Figure 1 for Class-Aware Adversarial Lung Nodule Synthesis in CT Images

Figure 2 for Class-Aware Adversarial Lung Nodule Synthesis in CT Images

Figure 3 for Class-Aware Adversarial Lung Nodule Synthesis in CT Images

Figure 4 for Class-Aware Adversarial Lung Nodule Synthesis in CT Images

Abstract:Though large-scale datasets are essential for training deep learning systems, it is expensive to scale up the collection of medical imaging datasets. Synthesizing the objects of interests, such as lung nodules, in medical images based on the distribution of annotated datasets can be helpful for improving the supervised learning tasks, especially when the datasets are limited by size and class balance. In this paper, we propose the class-aware adversarial synthesis framework to synthesize lung nodules in CT images. The framework is built with a coarse-to-fine patch in-painter (generator) and two class-aware discriminators. By conditioning on the random latent variables and the target nodule labels, the trained networks are able to generate diverse nodules given the same context. By evaluating on the public LIDC-IDRI dataset, we demonstrate an example application of the proposed framework for improving the accuracy of the lung nodule malignancy estimation as a binary classification problem, which is important in the lung screening scenario. We show that combining the real image patches and the synthetic lung nodules in the training set can improve the mean AUC classification score across different network architectures by 2%.

Via

Access Paper or Ask Questions