Yuxiao Dong

ReST-RL: Achieving Accurate Code Reasoning of LLMs with Optimized Self-Training and Decoding

Aug 27, 2025
Via arXiv

ComputerRL: Scaling End-to-End Online Reinforcement Learning for Computer Use Agents

Aug 19, 2025
Via arXiv

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Aug 08, 2025
Via arXiv

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search

Jun 13, 2025
Via arXiv

SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling

Jun 09, 2025
Via arXiv

AndroidGen: Building an Android Language Agent under Data Scarcity

Apr 27, 2025
Via arXiv

Controlling Large Language Model with Latent Actions

Mar 27, 2025
Via arXiv

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Mar 26, 2025
Via arXiv

LongSafety: Evaluating Long-Context Safety of Large Language Models

Feb 24, 2025
Via arXiv