
Jesse Dodge

Establishing Task Scaling Laws via Compute-Efficient Model Ladders

Dec 05, 2024

Scalable Data Ablation Approximations for Language Models through Modular Training and Merging

Oct 21, 2024

Merge to Learn: Efficiently Adding Skills to Language Models with Model Merging

Oct 16, 2024

OLMES: A Standard for Language Model Evaluations

Jun 12, 2024

OLMo: Accelerating the Science of Language Models

Feb 07, 2024

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Jan 31, 2024

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters

Jan 16, 2024

Paloma: A Benchmark for Evaluating Language Model Fit

Dec 16, 2023

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets

Dec 15, 2023

What's In My Big Data?

Oct 31, 2023