Rohan Gumastate

Optimal Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning

Jun 14, 2024
Via arXiv