Kush Bhatia

Automated Rewards via LLM-Generated Progress Functions

Oct 11, 2024, via arXiv

Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates

Oct 07, 2024, via arXiv

The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry

Feb 06, 2024, via arXiv

SuperHF: Supervised Iterative Learning from Human Feedback

Oct 25, 2023, via arXiv

Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models

Jul 26, 2023, via arXiv

Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

Jul 20, 2023, via arXiv

TART: A plug-and-play Transformer module for task-agnostic reasoning

Jun 13, 2023, via arXiv

Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws

Feb 23, 2023, via arXiv

Congested Bandits: Optimal Routing via Short-term Resets

Jan 23, 2023, via arXiv

On the Sensitivity of Reward Inference to Misspecified Human Models

Dec 09, 2022, via arXiv