Lucas Smaira

DeepMind

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

May 23, 2023

Zorro: the masked multimodal transformer

Jan 23, 2023

TAP-Vid: A Benchmark for Tracking Any Point in a Video

Nov 07, 2022

Towards Learning Universal Audio Representations

Dec 01, 2021

Human-Agent Cooperation in Bridge Bidding

Nov 28, 2020

A Short Note on the Kinetics-700-2020 Human Action Dataset

Oct 21, 2020

Self-Supervised MultiModal Versatile Networks

Jun 29, 2020

Visual Grounding in Video for Unsupervised Word Translation

Mar 26, 2020

End-to-End Learning of Visual Representations from Uncurated Instructional Videos

Jan 17, 2020