Ian Steneker

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Aug 27, 2024
Via arXiv

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

Mar 06, 2024
Via arXiv