Daria Soboleva

Charles

Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

Feb 21, 2025
Via arXiv

Position Interpolation Improves ALiBi Extrapolation

Oct 18, 2023
Via arXiv

BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model

Sep 20, 2023
Via arXiv

SlimPajama-DC: Understanding Data Combinations for LLM Training

Sep 19, 2023
Via arXiv

Replacing Human Audio with Synthetic Audio for On-device Unspoken Punctuation Prediction

Oct 20, 2020
Via arXiv