Mengxin Zheng

BadFair: Backdoored Fairness Attacks with Group-conditioned Triggers

Oct 23, 2024, via arXiv

Jailbreaking LLMs with Arabic Transliteration and Arabizi

Jun 26, 2024, via arXiv

CR-UTP: Certified Robustness against Universal Text Perturbations

Jun 04, 2024, via arXiv

BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models

Jun 03, 2024, via arXiv

TrojFSP: Trojan Insertion in Few-shot Prompt Tuning

Dec 16, 2023, via arXiv

TrojFair: Trojan Fairness Attacks

Dec 16, 2023, via arXiv

TrojPrompt: A Black-box Trojan Attack on Pre-trained Language Models

Jun 12, 2023, via arXiv

SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning

Mar 16, 2023, via arXiv

TrojViT: Trojan Insertion in Vision Transformers

Aug 27, 2022, via arXiv