Andrew Zisserman

DeepMind

Recognising BSL Fingerspelling in Continuous Signing Sequences

Mar 19, 2026
Via arXiv

WISE: A Multimodal Search Engine for Visual Scenes, Audio, Objects, Faces, Speech, and Metadata

Feb 13, 2026
Via arXiv

Perception Test 2025: Challenge Summary and a Unified VQA Extension

Jan 09, 2026
Via arXiv

CountGD++: Generalized Prompting for Open-World Counting

Dec 29, 2025
Via arXiv

Recurrent Video Masked Autoencoders

Dec 15, 2025
Via arXiv

TARA: Simple and Efficient Time Aware Retrieval Adaptation of MLLMs for Video Understanding

Dec 15, 2025
Via arXiv

Efficiently Reconstructing Dynamic Scenes One D4RT at a Time

Dec 10, 2025
Via arXiv

Segment, Embed, and Align: A Universal Recipe for Aligning Subtitles to Signing

Dec 08, 2025
Via arXiv

Lost in Translation, Found in Embeddings: Sign Language Translation and Alignment

Dec 08, 2025
Via arXiv

Inferring Dynamic Physical Properties from Video Foundation Models

Oct 02, 2025
Via arXiv