Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xianyu

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

May 20, 2024

Jian Hu, Xibin Wu, Weixun Wang, Xianyu, Dehao Zhang, Yu Cao

Figure 1 for OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Figure 2 for OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Figure 3 for OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Figure 4 for OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

Abstract:As large language models (LLMs) continue to grow by scaling laws, reinforcement learning from human feedback (RLHF) has gained significant attention due to its outstanding performance. However, unlike pretraining or fine-tuning a single model, scaling reinforcement learning from human feedback (RLHF) for training large language models poses coordination challenges across four models. We present OpenRLHF, an open-source framework enabling efficient RLHF scaling. Unlike existing RLHF frameworks that co-locate four models on the same GPUs, OpenRLHF re-designs scheduling for the models beyond 70B parameters using Ray, vLLM, and DeepSpeed, leveraging improved resource utilization and diverse training approaches. Integrating seamlessly with Hugging Face, OpenRLHF provides an out-of-the-box solution with optimized algorithms and launch scripts, which ensures user-friendliness. OpenRLHF implements RLHF, DPO, rejection sampling, and other alignment techniques. Empowering state-of-the-art LLM development, OpenRLHF's code is available at https://github.com/OpenLLMAI/OpenRLHF.

Via

Access Paper or Ask Questions