Yangqiu Song

ErrorLLM: Modeling SQL Errors for Text-to-SQL Refinement

Mar 04, 2026
Via arXiv

AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations

Mar 02, 2026
Via arXiv

NGDB-Zoo: Towards Efficient and Scalable Neural Graph Databases Training

Feb 25, 2026
Via arXiv

HeaPA: Difficulty-Aware Heap Sampling and On-Policy Query Augmentation for LLM Reinforcement Learning

Jan 30, 2026
Via arXiv

Do Reasoning Models Enhance Embedding Models?

Jan 29, 2026
Via arXiv

$\mathbb{R}^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval

Jan 29, 2026
Via arXiv

NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems

Jan 16, 2026
Via arXiv

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

Oct 08, 2025
Via arXiv

The Cognitive Bandwidth Bottleneck: Shifting Long-Horizon Agent from Planning with Actions to Planning with Schemas

Oct 08, 2025
Via arXiv

LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game

Oct 06, 2025
Via arXiv