Max Bain

AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description

Jul 22, 2024
Via arXiv

Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models

May 03, 2024
Via arXiv

AutoAD III: The Prequel -- Back to the Pixels

Apr 22, 2024
Via arXiv

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Apr 18, 2024
Via arXiv

AutoAD II: The Sequel -- Who, When, and What in Movie Audio Description

Oct 10, 2023
Via arXiv

OxfordVGG Submission to the EGO4D AV Transcription Challenge

Jul 18, 2023
Via arXiv

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

May 24, 2023
Via arXiv

AutoAD: Movie Description in Context

Mar 29, 2023
Via arXiv

WhisperX: Time-Accurate Speech Transcription of Long-Form Audio

Mar 01, 2023
Via arXiv

A CLIP-Hitchhiker's Guide to Long Video Retrieval

May 17, 2022
Via arXiv