Shuyan Zhou

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Dec 18, 2024
Via arXiv

Beyond Browsing: API-Based Web Agents

Oct 21, 2024
Via arXiv

WebCanvas: Benchmarking Web Agents in Online Environments

Jun 18, 2024
Via arXiv

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Apr 11, 2024
Via arXiv

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

Jan 24, 2024
Via arXiv

WebArena: A Realistic Web Environment for Building Autonomous Agents

Jul 25, 2023
Via arXiv

Hierarchical Prompting Assists Large Language Model on Web Navigation

May 23, 2023
Via arXiv

Bridging the Gap: A Survey on Integrating Feedback for Natural Language Generation

May 01, 2023
Via arXiv

CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code

Feb 10, 2023
Via arXiv

Causal Reasoning of Entities and Events in Procedural Texts

Jan 29, 2023
Via arXiv