Minhao Cheng

Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment

Feb 06, 2025
Via arXiv

Improving Your Model Ranking on Chatbot Arena by Vote Rigging

Jan 29, 2025
Via arXiv

Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense

Oct 13, 2024
Via arXiv

Defense Against Syntactic Textual Backdoor Attacks with Token Substitution

Jul 04, 2024
Via arXiv

One Prompt is not Enough: Automated Construction of a Mixture-of-Expert Prompts

Jun 28, 2024
Via arXiv

MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?

Jun 22, 2024
Via arXiv

Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

Jun 05, 2024
Via arXiv

The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise

Jun 04, 2024
Via arXiv

Invisible Backdoor Attacks on Diffusion Models

Jun 02, 2024
Via arXiv

Input Snapshots Fusion for Scalable Discrete Dynamic Graph Neural Networks

May 11, 2024
Via arXiv