Heewoong Choi

Listwise Reward Estimation for Offline Preference-based Reinforcement Learning

Aug 08, 2024
Via arXiv