Brian Goodrich

Evaluating Language-Model Agents on Realistic Autonomous Tasks

Jan 04, 2024
Via arXiv