Muhammad Ferjad Naeem

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Nov 27, 2024
Via arXiv

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Oct 30, 2024
Via arXiv

Toward a Diffusion-Based Generalist for Dense Vision Tasks

Jun 29, 2024
Via arXiv

How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 08, 2024
Via arXiv

Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

May 06, 2024
Via arXiv

GiT: Towards Generalist Vision Transformer through Universal Language Interface

Mar 14, 2024
Via arXiv

FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

Mar 11, 2024
Via arXiv

Learning to Prompt with Text Only Supervision for Vision-Language Models

Jan 04, 2024
Via arXiv

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Nov 27, 2023
Via arXiv

SILC: Improving Vision Language Pretraining with Self-Distillation

Oct 20, 2023
Via arXiv