Zhiwen Gui

Foot In The Door: Understanding Large Language Model Jailbreaking via Cognitive Psychology

Feb 24, 2024
Via arXiv

Self-Deception: Reverse Penetrating the Semantic Firewall of Large Language Models

Aug 25, 2023
Via arXiv