The dark deep side of DeepSeek: Fine-tuning attacks against the safety alignment of CoT-enabled models

Feb 03, 2025


