Kaiyan Zhang

MARS$^2$: Scaling Multi-Agent Tree Search via Reinforcement Learning for Code Generation

Apr 16, 2026
Via arXiv

TIR-Agent: Training an Explorative and Efficient Agent for Image Restoration

Mar 29, 2026
Via arXiv

How Far Can Unsupervised RLVR Scale LLM Training?

Mar 09, 2026
Via arXiv

MARTI-MARS$^2$: Scaling Multi-Agent Self-Search via Reinforcement Learning for Code Generation

Feb 08, 2026
Via arXiv

Emotion-Director: Bridging Affective Shortcut in Emotion-Oriented Image Generation

Dec 22, 2025
Via arXiv

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Dec 18, 2025
Via arXiv

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Sep 30, 2025
Via arXiv

FlowRL: Matching Reward Distributions for LLM Reasoning

Sep 18, 2025
Via arXiv

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Sep 11, 2025
Via arXiv

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025
Via arXiv