Picture for Connor Kissane

Connor Kissane

Interpreting Attention Layer Outputs with Sparse Autoencoders

Add code
Jun 25, 2024
Viaarxiv icon