Yang Yu

Tsinghua University

Non-Adversarial Imitation Learning Provably Free of Compounding Errors: The Role of Bellman Constraints

Mar 24, 2026
Via arXiv

VLGOR: Visual-Language Knowledge Guided Offline Reinforcement Learning for Generalizable Agents

Mar 24, 2026
Via arXiv

Off-Policy Value-Based Reinforcement Learning for Large Language Models

Mar 24, 2026
Via arXiv

RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution

Mar 21, 2026
Via arXiv

Speedup Patch: Learning a Plug-and-Play Policy to Accelerate Embodied Manipulation

Mar 21, 2026
Via arXiv

Towards Practical World Model-based Reinforcement Learning for Vision-Language-Action Models

Mar 21, 2026
Via arXiv

SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress

Feb 26, 2026
Via arXiv

Structure-Aware Piano Accompaniment via Style Planning and Dataset-Aligned Pattern Retrieval

Feb 16, 2026
Via arXiv

InfiCoEvalChain: A Blockchain-Based Decentralized Framework for Collaborative LLM Evaluation

Feb 09, 2026
Via arXiv

POINTS-GUI-G: GUI-Grounding Journey

Feb 06, 2026
Via arXiv