Christiane Ahlheim

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Mar 15, 2024
Via arXiv