
Colin Raffel

Shammie

Realistic Evaluation of Model Merging for Compositional Generalization

Sep 26, 2024

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

Aug 13, 2024

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Jun 25, 2024

Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models

Apr 08, 2024

A Survey on Data Selection for Language Models

Mar 08, 2024

DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows

Feb 16, 2024

Learning to Route Among Specialized Experts for Zero-Shot Generalization

Feb 08, 2024

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Dec 13, 2023

Merging by Matching Models in Task Subspaces

Dec 07, 2023

Efficient Online Data Mixing For Language Model Pre-Training

Dec 05, 2023