Nazhou Liu

Adaptive Group Policy Optimization: Towards Stable Training and Token-Efficient Reasoning

Mar 20, 2025
Via arXiv