Tao Gui

DFPO: Scaling Value Modeling via Distributional Flow towards Robust and Generalizable LLM Post-Training

Feb 05, 2026
Via arXiv

Steering LLMs via Scalable Interactive Oversight

Feb 04, 2026
Via arXiv

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

Feb 04, 2026
Via arXiv

Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation

Feb 03, 2026
Via arXiv

CL-bench: A Benchmark for Context Learning

Feb 03, 2026
Via arXiv

ChartE$^{3}$: A Comprehensive Benchmark for End-to-End Chart Editing

Jan 29, 2026
Via arXiv

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Jan 20, 2026
Via arXiv

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Jan 20, 2026
Via arXiv

FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

Jan 19, 2026
Via arXiv

Can Deep Research Agents Find and Organize? Evaluating the Synthesis Gap with Expert Taxonomies

Jan 18, 2026
Via arXiv