Ludwig Schmidt

Shammie

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Nov 12, 2024
Via arXiv

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Aug 16, 2024
Via arXiv

Better Alignment with Instruction Back-and-Forth Translation

Aug 08, 2024
Via arXiv

Resolving Discrepancies in Compute-Optimal Scaling of Language Models

Jun 27, 2024
Via arXiv

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024
Via arXiv

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Jun 17, 2024
Via arXiv

Large Scale Transfer Learning for Tabular Data via Language Modeling

Jun 17, 2024
Via arXiv

Why are Visually-Grounded Language Models Bad at Image Classification?

May 28, 2024
Via arXiv

Multilingual Diversity Improves Vision-Language Representations

May 27, 2024
Via arXiv

Getting it Right: Improving Spatial Consistency in Text-to-Image Models

Apr 01, 2024
Via arXiv