David Grangier

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining

Oct 02, 2025
Via arXiv

Partial Parameter Updates for Efficient Distributed Training

Sep 26, 2025
Via arXiv

Assessing the Role of Data Quality in Training Bilingual Language Models

Jun 15, 2025
Via arXiv

Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection

Feb 09, 2025
Via arXiv

Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging

Feb 03, 2025
Via arXiv

Training Bilingual LMs with Data Constraints in the Targeted Language

Nov 20, 2024
Via arXiv

Aggregate-and-Adapt Natural Language Prompts for Downstream Generalization of CLIP

Oct 31, 2024
Via arXiv

No Need to Talk: Asynchronous Mixture of Language Models

Oct 04, 2024
Via arXiv

Dynamic Gradient Alignment for Online Data Mixing

Oct 03, 2024
Via arXiv

The AdEMAMix Optimizer: Better, Faster, Older

Sep 05, 2024
Via arXiv