Chengzhi Mao

Diversity Helps Jailbreak Large Language Models

Nov 06, 2024

I Can Hear You: Selective Robust Training for Deepfake Audio Detection

Oct 31, 2024

SPIN: Self-Supervised Prompt INjection

Oct 17, 2024

RAFT: Realistic Attacks to Fool Text Detectors

Oct 04, 2024

Learning to Rewrite: Generalized LLM-Generated Text Detection

Aug 08, 2024

Turns Out I'm Not Real: Towards Robust Detection of AI-Generated Videos

Jun 13, 2024

ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object

Mar 27, 2024

SelfIE: Self-Interpretation of Large Language Model Embeddings

Mar 26, 2024

Raidar: geneRative AI Detection viA Rewriting

Jan 23, 2024

Robustifying Language Models with Test-Time Adaptation

Oct 29, 2023