Zachary S. Siegel

CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark

Sep 17, 2024

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

Jul 16, 2024

AI Agents That Matter

Jul 01, 2024

Learning adaptive planning representations with natural language guidance

Dec 13, 2023