Erick Galinkin

Montreal AI Ethics Institute

Improved Large Language Model Jailbreak Detection via Pretrained Embeddings

Dec 02, 2024

Towards Type Agnostic Cyber Defense Agents

Dec 02, 2024

The Price of Pessimism for Automated Defense

Sep 28, 2024

garak: A Framework for Security Probing Large Language Models

Jun 16, 2024

AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts

Apr 09, 2024

Simulation of Attacker Defender Interaction in a Noisy Security Game

Dec 08, 2022

Robustness and Usefulness in AI Explanation Methods

Mar 07, 2022

Towards a Responsible AI Development Lifecycle: Lessons From Information Security

Mar 06, 2022

Evaluating Attacker Risk Behavior in an Internet of Things Ecosystem

Sep 23, 2021

Who's Afraid of Thomas Bayes?

Jul 30, 2021