Yingbin Liang

On the Learning Dynamics of RLVR at the Edge of Competence

Feb 16, 2026
Via arXiv

Constraint-Rectified Training for Efficient Chain-of-Thought

Feb 13, 2026
Via arXiv

Reward Modeling for Reinforcement Learning-Based LLM Reasoning: Design, Challenges, and Evaluation

Feb 10, 2026
Via arXiv

Learnable Chernoff Baselines for Inference-Time Alignment

Feb 08, 2026
Via arXiv

Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation

Feb 03, 2026
Via arXiv

ConvexBench: Can LLMs Recognize Convex Functions?

Feb 01, 2026
Via arXiv

Mixture-of-Transformers Learn Faster: A Theoretical Study on Classification Problems

Oct 30, 2025
Via arXiv

Monitoring State Transitions in Markovian Systems with Sampling Cost

Oct 25, 2025
Via arXiv

Large Language Models Achieve Gold Medal Performance at International Astronomy & Astrophysics Olympiad

Oct 06, 2025
Via arXiv

Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent

Aug 11, 2025
Via arXiv