Gonçalo Paulo

Automatically Interpreting Millions of Features in Large Language Models

Oct 17, 2024
Via arXiv

Does Transformer Interpretability Transfer to RNNs?

Apr 09, 2024
Via arXiv