Qiu Jiantao

Unsupervised Topic Models are Data Mixers for Pre-training Language Models

Feb 24, 2025
Via arXiv

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining

Oct 10, 2024
Via arXiv