Dhruv Gautam

RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code

Mar 10, 2025
Via arXiv