Suchin Gururangan

The Llama 3 Herd of Models

Jul 31, 2024

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024

Language models scale reliably with over-training and on downstream tasks

Mar 13, 2024

LESS: Selecting Influential Data for Targeted Instruction Tuning

Feb 20, 2024

Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models

Jan 19, 2024

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters

Jan 16, 2024

Time is Encoded in the Weights of Finetuned Language Models

Dec 30, 2023

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

Aug 08, 2023

Information Flow Control in Machine Learning through Modular Model Architecture

Jun 05, 2023

Scaling Expert Language Models with Unsupervised Domain Discovery

Mar 24, 2023