Benedikt Stroebl

CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark

Sep 17, 2024
Via arXiv

AI Agents That Matter

Jul 01, 2024
Via arXiv