Zeye Sun

Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models

Jan 09, 2025
Via arXiv

Minor DPO reject penalty to increase training robustness

Aug 22, 2024
Via arXiv

Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation

Aug 20, 2024
Via arXiv