Fengqing Jiang

Stronger Models are NOT Stronger Teachers for Instruction Tuning

Nov 12, 2024

CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models

Jun 18, 2024

ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

Jun 17, 2024

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Jun 12, 2024

ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning

May 31, 2024

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

Feb 24, 2024

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

Feb 22, 2024

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

Jan 20, 2024

Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning

Jan 10, 2024

MDTD: A Multi Domain Trojan Detector for Deep Neural Networks

Sep 03, 2023