Yangqiu Song

Reflect to Inform: Boosting Multimodal Reasoning via Information-Gain-Driven Verification

Mar 27, 2026 (via arXiv)

Training LLMs for Multi-Step Tool Orchestration with Constrained Data Synthesis and Graduated Rewards

Mar 25, 2026 (via arXiv)

OmniCompliance-100K: A Multi-Domain, Rule-Grounded, Real-World Safety Compliance Dataset

Mar 14, 2026 (via arXiv)

ErrorLLM: Modeling SQL Errors for Text-to-SQL Refinement

Mar 04, 2026 (via arXiv)

AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations

Mar 02, 2026 (via arXiv)

NGDB-Zoo: Towards Efficient and Scalable Neural Graph Databases Training

Feb 25, 2026 (via arXiv)

HeaPA: Difficulty-Aware Heap Sampling and On-Policy Query Augmentation for LLM Reinforcement Learning

Jan 30, 2026 (via arXiv)

Do Reasoning Models Enhance Embedding Models?

Jan 29, 2026 (via arXiv)

$\mathbb{R}^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval

Jan 29, 2026 (via arXiv)

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Jan 16, 2026 (via arXiv)