Alexander Robey

Jailbreaking LLM-Controlled Robots

Oct 17, 2024

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation

Mar 28, 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

Mar 28, 2024

A Safe Harbor for AI Evaluation and Red Teaming

Mar 07, 2024

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing

Feb 28, 2024

Data-Driven Modeling and Verification of Perception-Based Autonomous Systems

Dec 11, 2023

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks

Oct 13, 2023

Jailbreaking Black Box Large Language Models in Twenty Queries

Oct 13, 2023

Adversarial Training Should Be Cast as a Non-Zero-Sum Game

Jun 19, 2023

Probable Domain Generalization via Quantile Risk Minimization

Jul 20, 2022