Yang Yu

Tsinghua University

Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents

Apr 27, 2026

On Benchmark Hacking in ML Contests: Modeling, Insights and Design

Apr 24, 2026

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Apr 14, 2026

Adapting 2D Multi-Modal Large Language Model for 3D CT Image Analysis

Apr 11, 2026

ReinVBC: A Model-based Reinforcement Learning Approach to Vehicle Braking Controller

Apr 06, 2026

Off-Policy Value-Based Reinforcement Learning for Large Language Models

Mar 24, 2026

Non-Adversarial Imitation Learning Provably Free of Compounding Errors: The Role of Bellman Constraints

Mar 24, 2026

VLGOR: Visual-Language Knowledge Guided Offline Reinforcement Learning for Generalizable Agents

Mar 24, 2026

Speedup Patch: Learning a Plug-and-Play Policy to Accelerate Embodied Manipulation

Mar 21, 2026

RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution

Mar 21, 2026