Salman Khan

UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities

Dec 13, 2024
Via arXiv

Diffusion-Enhanced Test-time Adaptation with Text and Image Augmentation

Dec 12, 2024
Via arXiv

BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities

Dec 10, 2024
Via arXiv

GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks

Nov 28, 2024
Via arXiv

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

Nov 25, 2024
Via arXiv

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Nov 07, 2024
Via arXiv

ROAD-Waymo: Action Awareness at Scale for Autonomous Driving

Nov 03, 2024
Via arXiv

COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes

Oct 31, 2024
Via arXiv

CAMEL-Bench: A Comprehensive Arabic LMM Benchmark

Oct 24, 2024
Via arXiv

How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization?

Oct 23, 2024
Via arXiv