Picture for Andrew Poulton

Andrew Poulton

Jack

Evaluation data contamination in LLMs: how do we measure it and (when) does it matter?

Add code
Nov 06, 2024
Viaarxiv icon

The Llama 3 Herd of Models

Add code
Jul 31, 2024
Viaarxiv icon

Quantifying Variance in Evaluation Benchmarks

Add code
Jun 14, 2024
Figure 1 for Quantifying Variance in Evaluation Benchmarks
Figure 2 for Quantifying Variance in Evaluation Benchmarks
Figure 3 for Quantifying Variance in Evaluation Benchmarks
Figure 4 for Quantifying Variance in Evaluation Benchmarks
Viaarxiv icon

Llama 2: Open Foundation and Fine-Tuned Chat Models

Add code
Jul 19, 2023
Figure 1 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Figure 2 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Figure 3 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Figure 4 for Llama 2: Open Foundation and Fine-Tuned Chat Models
Viaarxiv icon

A Theory on Adam Instability in Large-Scale Machine Learning

Add code
Apr 25, 2023
Viaarxiv icon

Galactica: A Large Language Model for Science

Add code
Nov 16, 2022
Viaarxiv icon