Andy Peng

Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data

Dec 10, 2024
Via arXiv