Vaishaal Shankar

TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining

Apr 02, 2025
Via arXiv

Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality

Mar 10, 2025
Via arXiv

Apple Intelligence Foundation Language Models

Jul 29, 2024
Via arXiv

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024
Via arXiv

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum

May 21, 2024
Via arXiv

Language models scale reliably with over-training and on downstream tasks

Mar 13, 2024
Via arXiv

Scalable Pre-training of Large Autoregressive Image Models

Jan 16, 2024
Via arXiv

Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation

Nov 27, 2023
Via arXiv

TiC-CLIP: Continual Training of CLIP Models

Oct 24, 2023
Via arXiv

Robust multimodal models have outlier features and encode more concepts

Oct 19, 2023
Via arXiv