Chandler Zhou

Aligning Language Models with Offline Reinforcement Learning from Human Feedback

Aug 23, 2023
Via arXiv