M. Jehanzeb Mirza

Teaching VLMs to Localize Specific Objects from In-context Examples

Nov 20, 2024

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Oct 15, 2024

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

Oct 08, 2024

ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

Jun 12, 2024

Into the Fog: Evaluating Multiple Object Tracking Robustness

Apr 12, 2024

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

Mar 19, 2024

Towards Multimodal In-Context Learning for Vision & Language Models

Mar 19, 2024

TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification

Sep 13, 2023

Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions

May 30, 2023

LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

May 29, 2023