Will Ellsworth

Offline Regularised Reinforcement Learning for Large Language Models Alignment

May 29, 2024
Via arXiv