Amelia F. Hardy

ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts

Jul 12, 2024
Via arXiv