Picture for Daniel Salz

Daniel Salz

TIPS: Text-Image Pretraining with Spatial Awareness

Add code
Oct 21, 2024
Figure 1 for TIPS: Text-Image Pretraining with Spatial Awareness
Figure 2 for TIPS: Text-Image Pretraining with Spatial Awareness
Figure 3 for TIPS: Text-Image Pretraining with Spatial Awareness
Figure 4 for TIPS: Text-Image Pretraining with Spatial Awareness
Viaarxiv icon

PaliGemma: A versatile 3B VLM for transfer

Add code
Jul 10, 2024
Figure 1 for PaliGemma: A versatile 3B VLM for transfer
Figure 2 for PaliGemma: A versatile 3B VLM for transfer
Figure 3 for PaliGemma: A versatile 3B VLM for transfer
Figure 4 for PaliGemma: A versatile 3B VLM for transfer
Viaarxiv icon

Improve Supervised Representation Learning with Masked Image Modeling

Add code
Dec 01, 2023
Viaarxiv icon

PaLI-3 Vision Language Models: Smaller, Faster, Stronger

Add code
Oct 17, 2023
Figure 1 for PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Figure 2 for PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Figure 3 for PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Figure 4 for PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Viaarxiv icon

PaLI-X: On Scaling up a Multilingual Vision and Language Model

Add code
May 29, 2023
Figure 1 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 2 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 3 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 4 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Viaarxiv icon

PaLI: A Jointly-Scaled Multilingual Language-Image Model

Add code
Sep 16, 2022
Figure 1 for PaLI: A Jointly-Scaled Multilingual Language-Image Model
Figure 2 for PaLI: A Jointly-Scaled Multilingual Language-Image Model
Figure 3 for PaLI: A Jointly-Scaled Multilingual Language-Image Model
Figure 4 for PaLI: A Jointly-Scaled Multilingual Language-Image Model
Viaarxiv icon