Yishuo Cai

Course-Correction: Safety Alignment Using Synthetic Preferences

Jul 23, 2024
Via arXiv

Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

May 21, 2024
Via arXiv