Yafu Li

CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning

Apr 16, 2026 (via arXiv)

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Apr 08, 2026 (via arXiv)

GEMS: Agent-Native Multimodal Generation with Memory and Skills

Mar 30, 2026 (via arXiv)

Characterizing, Evaluating, and Optimizing Complex Reasoning

Feb 09, 2026 (via arXiv)

New Skills or Sharper Primitives? A Probabilistic Perspective on the Emergence of Reasoning in RLVR

Feb 09, 2026 (via arXiv)

LatentMem: Customizing Latent Memory for Multi-Agent Systems

Feb 03, 2026 (via arXiv)

Learning to Reason Faithfully through Step-Level Faithfulness Maximization

Feb 03, 2026 (via arXiv)

DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models

Dec 30, 2025 (via arXiv)

VideoSSR: Video Self-Supervised Reinforcement Learning

Nov 09, 2025 (via arXiv)

ExGRPO: Learning to Reason from Experience

Oct 02, 2025 (via arXiv)