Olivier Pietquin

Self-Improving Robust Preference Optimization

Jun 03, 2024
Via arXiv