Aaquib Syed

Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization

Oct 16, 2024
Via arXiv

Refusal in Language Models Is Mediated by a Single Direction

Jun 17, 2024
Via arXiv

Attribution Patching Outperforms Automated Circuit Discovery

Oct 16, 2023
Via arXiv