Max Bain

AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description

Jul 22, 2024
Via arXiv

Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models

May 03, 2024
Via arXiv

AutoAD III: The Prequel -- Back to the Pixels

Apr 22, 2024
Via arXiv

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Apr 18, 2024
Via arXiv

AutoAD II: The Sequel -- Who, When, and What in Movie Audio Description

Oct 10, 2023
Via arXiv

OxfordVGG Submission to the EGO4D AV Transcription Challenge

Jul 18, 2023
Via arXiv

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

May 24, 2023
Via arXiv

AutoAD: Movie Description in Context

Mar 29, 2023
Via arXiv

WhisperX: Time-Accurate Speech Transcription of Long-Form Audio

Mar 01, 2023
Via arXiv

A CLIP-Hitchhiker's Guide to Long Video Retrieval

May 17, 2022
Via arXiv