Thomas Marshall

Refusal in LLMs is an Affine Function

Nov 13, 2024
Via arXiv

Does Transformer Interpretability Transfer to RNNs?

Apr 09, 2024
Via arXiv