Picture for Chaofan Tao

Chaofan Tao

OVD: On-policy Verbal Distillation

Add code
Jan 29, 2026
Viaarxiv icon

From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation

Add code
Jan 26, 2026
Viaarxiv icon

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Add code
Jan 20, 2026
Viaarxiv icon

MMDeepResearch-Bench: A Benchmark for Multimodal Deep Research Agents

Add code
Jan 18, 2026
Viaarxiv icon

SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving

Add code
Jan 07, 2026
Viaarxiv icon

MMFormalizer: Multimodal Autoformalization in the Wild

Add code
Jan 06, 2026
Viaarxiv icon

A1: Asynchronous Test-Time Scaling via Conformal Prediction

Add code
Sep 18, 2025
Figure 1 for A1: Asynchronous Test-Time Scaling via Conformal Prediction
Figure 2 for A1: Asynchronous Test-Time Scaling via Conformal Prediction
Figure 3 for A1: Asynchronous Test-Time Scaling via Conformal Prediction
Figure 4 for A1: Asynchronous Test-Time Scaling via Conformal Prediction
Viaarxiv icon

LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction

Add code
Sep 09, 2025
Figure 1 for LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
Figure 2 for LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
Figure 3 for LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
Figure 4 for LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
Viaarxiv icon

The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs

Add code
Jul 10, 2025
Figure 1 for The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
Figure 2 for The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
Figure 3 for The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
Figure 4 for The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
Viaarxiv icon

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving

Add code
May 29, 2025
Figure 1 for SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Figure 2 for SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Figure 3 for SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Figure 4 for SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Viaarxiv icon