Minhao Cheng

Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense

Oct 13, 2024

Defense Against Syntactic Textual Backdoor Attacks with Token Substitution

Jul 04, 2024

One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

Jun 28, 2024

MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?

Jun 22, 2024

Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

Jun 05, 2024

The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise

Jun 04, 2024

Invisible Backdoor Attacks on Diffusion Models

Jun 02, 2024

Input Snapshots Fusion for Scalable Discrete Dynamic Graph Neural Networks

May 11, 2024

DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers

Mar 01, 2024

Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning

Feb 24, 2024