Title: CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging

Jul 10, 2024

Raza Imam, Mohammed Talha Alam, Umaima Rahman, Mohsen Guizani, Fakhri Karray

Abstract: Existing vision-text contrastive learning models enhance representation transferability and support zero-shot prediction by matching paired image and caption embeddings while pushing unrelated pairs apart. However, astronomical image-label datasets are significantly smaller than the general image-label datasets available from the internet. We introduce CosmoCLIP, an astronomical image-text contrastive learning framework fine-tuned from the pre-trained CLIP model using SpaceNet and BLIP-based captions. SpaceNet, obtained via FLARE, comprises ~13k optimally distributed images, while BLIP acts as a rich knowledge extractor. The rich semantics derived from the SpaceNet images and BLIP descriptions, when learned contrastively, enable CosmoCLIP to achieve superior generalization across various in-domain and out-of-domain tasks. Our results demonstrate that CosmoCLIP is a straightforward yet powerful framework, significantly outperforming CLIP in zero-shot classification and image-text retrieval tasks.
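
The contrastive objective the abstract describes (pull paired image and caption embeddings together, push unrelated pairs apart) can be illustrated with a small NumPy sketch of a symmetric CLIP-style loss and of zero-shot prediction by nearest class caption. This is not the authors' implementation: the function names `clip_style_loss` and `zero_shot_predict` are hypothetical, the embeddings are assumed to come from separate image and text encoders that are not shown, and the temperature of 0.07 is an assumed placeholder value.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each vector to unit length so dot products are cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of image_emb and row i of text_emb are assumed to be a matching
    image-caption pair; every other combination in the batch is a negative.
    The temperature value is an assumption made for this sketch.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])            # matching pairs sit on the diagonal
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> text
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)


def zero_shot_predict(image_emb, class_text_emb):
    """Assign each image to the class whose caption embedding is most similar."""
    sims = l2_normalize(image_emb) @ l2_normalize(class_text_emb).T
    return sims.argmax(axis=1)
```

For a batch of correctly paired embeddings the loss is low, and it rises when the captions are shuffled relative to the images, which is the signal a fine-tuning procedure of this kind would minimize.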

* Accepted at SPAICE Conference, ECSAT, UK, 2024
