Yuzhong Hong

Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model

Dec 18, 2024
Via arXiv

Preference-Oriented Supervised Fine-Tuning: Favoring Target Model Over Aligned Large Language Models

Dec 17, 2024
Via arXiv