Ling Liu

Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models

Oct 11, 2024

LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity

Oct 04, 2024

Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge

Oct 03, 2024

Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey

Sep 26, 2024

Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation

Sep 04, 2024

Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning

Aug 18, 2024

Personalized Privacy Protection Mask Against Unauthorized Facial Recognition

Jul 19, 2024

ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models

May 28, 2024

Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning

May 28, 2024