Yuzhong Hong

Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model

Dec 18, 2024
Via arXiv

Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models

Dec 17, 2024
Via arXiv