Jarvis Jin

PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Mar 15, 2024
Via arXiv