Lawrence Jang

VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks

Oct 24, 2024

Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Sep 12, 2024

ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights

Jun 20, 2024

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

Jan 24, 2024