Dawn Song

University of California, Berkeley

Same-Origin Policy for Agentic Browsers

Jun 12, 2026
Via arXiv

AgentBeats: Agentifying Agent Assessment for Openness, Standardization, and Reproducibility

Jun 11, 2026
Via arXiv

Representational Similarity and Model Behavior in Multi-Agent Interaction

Jun 05, 2026
Via arXiv

Agents' Last Exam

Jun 03, 2026
Via arXiv

CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities

Jun 03, 2026
Via arXiv

Can Generalist Agents Automate Data Curation?

Jun 02, 2026
Via arXiv

BenchEvolver: Frontier Task Synthesis via Solution-Centric Evolution

May 31, 2026
Via arXiv

SCDBench: A Benchmark for LLM-Based Smart Contract Decompilers

May 27, 2026
Via arXiv

Measuring Real-World Prompt Injection Attacks in LLM-based Resume Screening

May 27, 2026
Via arXiv

MemFail: Stress-Testing Failure Modes of LLM Memory Systems

May 26, 2026
Via arXiv