Thomas Coste

Lightweight Neural App Control

Oct 23, 2024
Via arXiv

Bayesian Reward Models for LLM Alignment

Feb 20, 2024
Via arXiv

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

Dec 22, 2023
Via arXiv

Reward Model Ensembles Help Mitigate Overoptimization

Oct 04, 2023
Via arXiv