Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process

May 20, 2024
Figures 1-4 for Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process

View paper on arXiv
