Mansheej Paul

Scaling Laws for Precision

Nov 07, 2024
Via arXiv

Critique-out-Loud Reward Models

Aug 21, 2024
Via arXiv

Does your data spark joy? Performance gains from domain upsampling at the end of training

Jun 05, 2024
Via arXiv

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

May 30, 2024
Via arXiv

LoRA Learns Less and Forgets Less

May 15, 2024
Via arXiv

Pretraining task diversity and the emergence of non-Bayesian in-context learning for regression

Jun 26, 2023
Via arXiv

Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?

Oct 06, 2022
Via arXiv

Lottery Tickets on a Data Diet: Finding Initializations with Sparse Trainable Networks

Jun 02, 2022
Via arXiv

Deep Learning on a Data Diet: Finding Important Examples Early in Training

Jul 15, 2021
Via arXiv

Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel

Oct 28, 2020
Via arXiv