Michael T. Pearce

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs

Oct 15, 2024
Via arXiv

Bilinear MLPs enable weight-based mechanistic interpretability

Oct 10, 2024
Via arXiv

Weight-based Decomposition: A Case for Bilinear MLPs

Jun 06, 2024
Via arXiv