Tiansheng Huang

Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation

Oct 13, 2024

LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity

Oct 04, 2024

Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey

Sep 26, 2024

Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation

Sep 04, 2024

Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning

Aug 18, 2024

Personalized Privacy Protection Mask Against Unauthorized Facial Recognition

Jul 19, 2024

Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning

May 28, 2024

Robust Few-Shot Ensemble Learning with Focal Diversity-Based Pruning

Apr 05, 2024

A Survey on Large Language Model-Based Game Agents

Apr 02, 2024