Joan Puigcerver

PaliGemma: A versatile 3B VLM for transfer

Jul 10, 2024

Routers in Vision Mixture of Experts: An Empirical Study

Jan 29, 2024

From Sparse to Soft Mixtures of Experts

Aug 02, 2023

Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution

Jul 12, 2023

Scaling Vision Transformers to 22 Billion Parameters

Feb 10, 2023

Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective

Feb 06, 2023

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

Dec 09, 2022

On the Adversarial Robustness of Mixture of Experts

Oct 19, 2022

Sparsity-Constrained Optimal Transport

Sep 30, 2022

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Sep 16, 2022