Aravindh Mahendran

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023

Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

Feb 09, 2023

RUST: Latent Neural Scene Representations from Unposed Imagery

Nov 25, 2022

Iterative Patch Selection for High-Resolution Image Recognition

Oct 24, 2022

SAVi++: Towards End-to-End Object-Centric Learning from Real-World Videos

Jun 15, 2022

Object Scene Representation Transformer

Jun 14, 2022

Simple Open-Vocabulary Object Detection with Vision Transformers

May 12, 2022

Conditional Object-Centric Learning from Video

Nov 24, 2021

Differentiable Patch Selection for Image Recognition

Apr 07, 2021

Representation learning from videos in-the-wild: An object-centric approach

Oct 06, 2020