Shiming Xie

Minor DPO reject penalty to increase training robustness

Aug 22, 2024
Via arXiv

Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation

Aug 20, 2024
Via arXiv