Felix Friedrich

SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs

Nov 11, 2024
Via arXiv

LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment

Jun 07, 2024
Via arXiv

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

Apr 06, 2024
Via arXiv

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Mar 30, 2024
Via arXiv

Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You

Jan 31, 2024
Via arXiv

LEDITS++: Limitless Image Editing using Text-to-Image Models

Nov 28, 2023
Via arXiv

Learning by Self-Explaining

Sep 15, 2023
Via arXiv

Learning to Intervene on Concept Bottlenecks

Aug 25, 2023
Via arXiv

Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness?

May 28, 2023
Via arXiv

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

May 24, 2023
Via arXiv