Jiashun Liu

Reformulate LLM Reinforcement Learning for Efficient Training under Black-box Discrepancy

Jun 09, 2026

When RL Fails after SFT: Rejuvenating Model Plasticity for Robust SFT-to-RL Handoff

Jun 07, 2026

Local Guidance, Global Impact: Gaussian-Reshaped Trust Region Unlocks Behavior Transitions

Jun 04, 2026

The Rank and Gradient Lost in Non-stationarity: Sample Weight Decay for Mitigating Plasticity Loss in Reinforcement Learning

Apr 02, 2026

Complementary Reinforcement Learning

Mar 18, 2026

CE-RM: A Pointwise Generative Reward Model Optimized via Two-Stage Rollout and Unified Criteria

Jan 28, 2026

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Dec 31, 2025

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning

Oct 02, 2025

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Aug 11, 2025

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning

Jun 16, 2025