Xing Han Lù

Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents

May 28, 2026
Via arXiv

Weasel: Out-of-Domain Generalization for Web Agents via Importance-Diversity Data Selection

May 19, 2026
Via arXiv

Structured Distillation of Web Agent Capabilities Enables Generalization

Apr 09, 2026
Via arXiv

CUBE: A Standard for Unifying Agent Benchmarks

Mar 16, 2026
Via arXiv

Grounding Computer Use Agents on Human Demonstrations

Nov 10, 2025
Via arXiv

Build the web for agents, not agents for the web

Jun 12, 2025
Via arXiv

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Apr 11, 2025
Via arXiv

SafeArena: Evaluating the Safety of Autonomous Web Agents

Mar 06, 2025
Via arXiv

MMTEB: Massive Multilingual Text Embedding Benchmark

Feb 19, 2025
Via arXiv

The BrowserGym Ecosystem for Web Agent Research

Dec 10, 2024
Via arXiv