Makarand Tapaswi

CVIT, IIIT Hyderabad

The Sound of Water: Inferring Physical Properties from Pouring Liquids

Nov 18, 2024

IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark

Nov 12, 2024

No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning

Sep 04, 2024

Major Entity Identification: A Generalizable Alternative to Coreference Resolution

Jun 20, 2024

VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time?

Jun 16, 2024

"Previously on ..." From Recaps to Story Summarization

May 19, 2024

MICap: A Unified Model for Identity-aware Movie Descriptions

May 19, 2024

NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry

May 09, 2024

FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos

Jan 15, 2024

Eye vs. AI: Human Gaze and Model Attention in Video Memorability

Nov 26, 2023