Songyang Gao

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Mar 26, 2026

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Dec 12, 2025

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Dec 12, 2025

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Dec 11, 2025

Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning

Jul 22, 2025

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Jul 17, 2025

Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law

Jun 16, 2025

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future

Apr 12, 2025

Unicorn: Text-Only Data Synthesis for Vision Language Model Training

Mar 28, 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Feb 10, 2025