Yafu Li

Learning to Reason Faithfully through Step-Level Faithfulness Maximization

Feb 03, 2026

LatentMem: Customizing Latent Memory for Multi-Agent Systems

Feb 03, 2026

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Dec 30, 2025

VideoSSR: Video Self-Supervised Reinforcement Learning

Nov 09, 2025

ExGRPO: Learning to Reason from Experience

Oct 02, 2025

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Sep 18, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025

Synthesizing Sheet Music Problems for Evaluation and Reinforcement Learning

Sep 04, 2025

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law

Jul 24, 2025

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Jun 04, 2025