Bingbing Xu

CAS Key Laboratory of AI Safety, Institute of Computing Technology, CAS, Beijing, China; Tsinghua University, Beijing, China

Beyond Entangled Planning: Task-Decoupled Planning for Long-Horizon Agents

Jan 12, 2026

HAG: Hierarchical Demographic Tree-based Agent Generation for Topic-Adaptive Simulation

Jan 09, 2026

GIFT: Games as Informal Training for Generalizable LLMs

Jan 09, 2026

Learning from Mistakes: Negative Reasoning Samples Enhance Out-of-Domain Generalization

Jan 08, 2026

From Outcomes to Processes: Guiding PRM Learning from ORM for Inference-Time Alignment

Jun 14, 2025

KnowCoder-V2: Deep Knowledge Analysis

Jun 07, 2025

Incentivizing Strong Reasoning from Weak Supervision

May 28, 2025

Incentivizing Reasoning from Weak Supervision

May 26, 2025

Inference-time Alignment in Continuous Space

May 26, 2025

InfoNCE is a Free Lunch for Semantically guided Graph Contrastive Learning

May 07, 2025