André Susano Pinto

PaliGemma 2: A Family of Versatile VLMs for Transfer

Dec 04, 2024
Via arXiv

JetFormer: An Autoregressive Generative Model of Raw Images and Text

Nov 29, 2024
Via arXiv

PaliGemma: A versatile 3B VLM for transfer

Jul 10, 2024
Via arXiv

LocCa: Visual Pretraining with Location-aware Captioners

Mar 28, 2024
Via arXiv

A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

Mar 30, 2023
Via arXiv

Tuning computer vision models with task rewards

Feb 16, 2023
Via arXiv

UViM: A Unified Modeling Approach for Vision with Learned Guiding Codes

May 27, 2022
Via arXiv

Learning to Merge Tokens in Vision Transformers

Feb 24, 2022
Via arXiv

Scaling Vision with Sparse Mixture of Experts

Jun 10, 2021
Via arXiv

Deep Ensembles for Low-Data Transfer Learning

Oct 19, 2020
Via arXiv