Kaiyan Zhang

Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models

Sep 30, 2025

FlowRL: Matching Reward Distributions for LLM Reasoning

Sep 18, 2025

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

Sep 11, 2025

AdsQA: Towards Advertisement Video Understanding

Sep 10, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025

Towards a Unified View of Large Language Model Post-Training

Sep 04, 2025

ReviewRL: Towards Automated Scientific Review with RL

Aug 14, 2025

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Jul 01, 2025

Automating Exploratory Multiomics Research via Language Models

Jun 09, 2025

Self-Reflective Reinforcement Learning for Diffusion-based Image Reasoning Generation

May 28, 2025