Sridhar Thiagarajan

Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models

Dec 18, 2024
Via arXiv

Finetuning Language Models to Emit Linguistic Expressions of Uncertainty

Sep 18, 2024
Via arXiv

Sample Efficient Deep Reinforcement Learning via Local Planning

Jan 29, 2023
Via arXiv