Wojciech Galuba

DINOv2: Learning Robust Visual Features without Supervision

Apr 14, 2023
Via arXiv

Masked Autoencoders that Listen

Jul 13, 2022
Via arXiv

FLAVA: A Foundational Language And Vision Alignment Model

Dec 08, 2021
Via arXiv

Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI

Sep 16, 2021
Via arXiv

Habitat 2.0: Training Home Assistants to Rearrange their Habitat

Jun 28, 2021
Via arXiv

Human-Adversarial Visual Question Answering

Jun 04, 2021
Via arXiv

TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text

May 12, 2021
Via arXiv