Beichen Zhang

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

Oct 31, 2025

From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR

Aug 11, 2025

Towards Effective Code-Integrated Reasoning

May 30, 2025

Qwen3 Technical Report

May 14, 2025

Slow Thinking for Sequential Recommendation

Apr 13, 2025

START: Self-taught Reasoner with Tools

Mar 07, 2025

An Empirical Study on Eliciting and Improving R1-like Reasoning Models

Mar 06, 2025

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Jan 13, 2025

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Jan 06, 2025

Qwen2.5 Technical Report

Dec 19, 2024