Edoardo Debenedetti

Measuring Non-Adversarial Reproduction of Training Data in Large Language Models

Nov 15, 2024
Via arXiv

Adversarial Search Engine Optimization for Large Language Models

Jun 26, 2024
Via arXiv

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

Jun 19, 2024
Via arXiv

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

Jun 12, 2024
Via arXiv

AI Risk Management Should Incorporate Both Safety and Security

May 29, 2024
Via arXiv

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

Mar 28, 2024
Via arXiv

Scaling Compute Is Not All You Need for Adversarial Robustness

Dec 20, 2023
Via arXiv

Privacy Side Channels in Machine Learning Systems

Sep 11, 2023
Via arXiv

Evading Black-box Classifiers Without Breaking Eggs

Jun 05, 2023
Via arXiv

A Light Recipe to Train Robust Vision Transformers

Sep 15, 2022
Via arXiv