Savvas Zannettou

Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification

Jul 30, 2024
Via arXiv

UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images

May 06, 2024
Via arXiv

A Comprehensive View of the Biases of Toxicity and Sentiment Analysis Methods Towards Utterances with African American English Expressions

Jan 23, 2024
Via arXiv

You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content

Aug 10, 2023
Via arXiv

Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models

May 23, 2023
Via arXiv

On the Evolution of Memes by Means of Multimodal Contrastive Learning

Dec 13, 2022
Via arXiv

Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots

Sep 09, 2022
Via arXiv

Feels Bad Man: Dissecting Automated Hateful Meme Detection Through the Lens of Facebook's Challenge

Feb 17, 2022
Via arXiv