Picture for Kieran Fraser

Kieran Fraser

MAD-MAX: Modular And Diverse Malicious Attack MiXtures for Automated LLM Red Teaming

Add code
Mar 08, 2025
Viaarxiv icon

Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs

Add code
Feb 21, 2025
Viaarxiv icon

Granite Guardian

Add code
Dec 10, 2024
Figure 1 for Granite Guardian
Figure 2 for Granite Guardian
Figure 3 for Granite Guardian
Figure 4 for Granite Guardian
Viaarxiv icon

MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks

Add code
Sep 27, 2024
Figure 1 for MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks
Figure 2 for MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks
Figure 3 for MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks
Figure 4 for MoJE: Mixture of Jailbreak Experts, Naive Tabular Classifiers as Guard for Prompt Attacks
Viaarxiv icon