Todor Mihaylov

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

Correlating and Predicting Human Evaluations of Language Models from Natural Language Processing Benchmarks

Feb 24, 2025

Optimizing Pretraining Data Mixtures with LLM-Estimated Utility

Jan 20, 2025

The Llama 3 Herd of Models

Jul 31, 2024

Llama 2: Open Foundation and Fine-Tuned Chat Models

Jul 19, 2023

Understanding In-Context Learning via Supportive Pretraining Data

Jun 26, 2023

bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark

Jun 07, 2023

Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training

Jan 05, 2023

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

Dec 28, 2022

OPT: Open Pre-trained Transformer Language Models

May 05, 2022