Sambit Sahu

Towards Scalable Customization and Deployment of Multi-Agent Systems for Enterprise Applications

Jun 16, 2026
Via arXiv

T1-Bench: Benchmarking Multi-Scenario Agents in Real-World Domains

Jun 09, 2026
Via arXiv

A History-Aware Visually Grounded Critic for Computer Use Agents

Jun 09, 2026
Via arXiv

MemGym: a Long-Horizon Memory Environment for LLM Agents

May 20, 2026
Via arXiv

AVSD: Adaptive-View Self-Distillation by Balancing Consensus and Teacher-Specific Privileged Signals

May 20, 2026
Via arXiv

CoT-Guard: Small Models for Strong Monitoring

May 12, 2026
Via arXiv

Your Model Diversity, Not Method, Determines Reasoning Strategy

Apr 12, 2026
Via arXiv

Decomposing the Delta: What Do Models Actually Learn from Preference Pairs?

Apr 09, 2026
Via arXiv

DIAL-SUMMER: A Structured Evaluation Framework of Hierarchical Errors in Dialogue Summaries

Feb 08, 2026
Via arXiv

Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection

Jan 14, 2026
Via arXiv