Kaarel Hänni

Mathematical Models of Computation in Superposition

Aug 10, 2024
Via arXiv

Cluster-norm for Unsupervised Probing of Knowledge

Jul 26, 2024
Via arXiv

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

May 17, 2024
Via arXiv

Using Degeneracy in the Loss Landscape for Mechanistic Interpretability

May 17, 2024
Via arXiv