Title: Language Augmentation in CLIP for Improved Anatomy Detection on Multi-modal Medical Images

May 31, 2024

Mansi Kakkar, Dattesh Shanbhag, Chandan Aladahalli, Gurunath Reddy M

Abstract: Vision-language models have emerged as a powerful tool for previously challenging multi-modal classification problems in the medical domain. This development has led to the exploration of automated image description generation for multi-modal clinical scans, particularly for radiology report generation. Existing research has focused on clinical descriptions for specific modalities or body regions, leaving a gap for a model that provides whole-body multi-modal descriptions. In this paper, we address this gap by automating the generation of standardized body station(s) and a list of organ(s) across the whole body in multi-modal MR and CT radiological images. Leveraging the versatility of Contrastive Language-Image Pre-training (CLIP), we refine and augment the existing approach through multiple experiments, including baseline model fine-tuning, adding station(s) as a superset for better correlation between organs, and image and language augmentations. Our proposed approach demonstrates a 47.6% performance improvement over the baseline PubMedCLIP.
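
To make the idea of language augmentation with a station superset more concrete, here is a minimal sketch of how several caption variants could be generated for one labeled scan before pairing them with the image in CLIP-style training. The abstract does not describe the actual augmentation scheme, so the station names, organ lists, templates, and the function names `join_organs` and `augmented_captions` are all hypothetical and serve only as an illustration.

```python
# Hypothetical caption templates; the paper's real prompts are not given in the abstract.
TEMPLATES = [
    "{modality} image of the {station} showing {organs}",
    "{station} {modality} scan containing {organs}",
    "a {modality} slice from the {station} station with {organs} visible",
]


def join_organs(organs):
    """Join organ names into a readable phrase, e.g. 'liver, kidneys and spleen'."""
    if not organs:
        raise ValueError("at least one organ is required")
    if len(organs) == 1:
        return organs[0]
    return ", ".join(organs[:-1]) + " and " + organs[-1]


def augmented_captions(modality, station, organs):
    """Return one caption per template for a single labeled image.

    The body station acts as a superset label that appears in every
    caption alongside the organs it contains.
    """
    organ_phrase = join_organs(list(organs))
    return [
        template.format(modality=modality, station=station, organs=organ_phrase)
        for template in TEMPLATES
    ]


if __name__ == "__main__":
    # Illustrative label only: an MR abdomen slice with two organs.
    for caption in augmented_captions("MR", "abdomen", ["liver", "kidneys"]):
        print(caption)
```

Each variant keeps the same modality, station, and organ labels and changes only the wording, so in this sketch the text encoder would see several phrasings of one ground-truth description.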

* © 2024 IEEE. Accepted at the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2024
