Tianrong Zhang

PromptFix: Few-shot Backdoor Removal via Adversarial Prompt Tuning

Jun 06, 2024
Via arXiv

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization

May 28, 2024
Via arXiv

WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response

May 22, 2024
Via arXiv

VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models

Feb 16, 2024
Via arXiv

VLAttack: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models

Oct 07, 2023
Via arXiv