Dawn Song

University of California, Berkeley

Progent: Programmable Privilege Control for LLM Agents

Apr 16, 2025
Via arXiv

Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT?

Apr 16, 2025
Via arXiv

DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks

Apr 15, 2025
Via arXiv

Assessing Judging Bias in Large Reasoning Models: An Empirical Study

Apr 14, 2025
Via arXiv

Type-Constrained Code Generation with Language Models

Apr 12, 2025
Via arXiv

Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs

Apr 07, 2025
Via arXiv

SoK: Frontier AI's Impact on the Cybersecurity Landscape

Apr 07, 2025
Via arXiv

An Illusion of Progress? Assessing the Current State of Web Agents

Apr 02, 2025
Via arXiv

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

Mar 19, 2025
Via arXiv

Improving LLM Safety Alignment with Dual-Objective Optimization

Mar 05, 2025
Via arXiv