
Daniel Kang

Voice-Enabled AI Agents can Perform Common Scams

Oct 21, 2024

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

Jun 02, 2024

LLM Agents can Autonomously Exploit One-day Vulnerabilities

Apr 11, 2024

Trustless Audits without Revealing Data or Models

Apr 06, 2024

A Safe Harbor for AI Evaluation and Red Teaming

Mar 07, 2024

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents

Mar 05, 2024

LLM Agents can Autonomously Hack Websites

Feb 16, 2024

Removing RLHF Protections in GPT-4 via Fine-Tuning

Nov 10, 2023

Identifying and Mitigating the Security Risks of Generative AI

Aug 28, 2023

Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks

Add code
Feb 11, 2023
Viaarxiv icon