Songyang Gao

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Dec 12, 2025

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Dec 12, 2025

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Dec 11, 2025

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

Jul 22, 2025

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Jul 17, 2025

Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law

Jun 16, 2025

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future

Apr 12, 2025

Unicorn: Text-Only Data Synthesis for Vision Language Model Training

Mar 28, 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Feb 10, 2025

Are Your LLMs Capable of Stable Reasoning?

Dec 17, 2024