Heegyu Kim

Break the Breakout: Reinventing LM Defense Against Jailbreak Attacks with Self-Refinement

Feb 27, 2024
Via arXiv

GTA: Gated Toxicity Avoidance for LM Performance Preservation

Dec 11, 2023
Via arXiv