Adam Karvonen

Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models

Jul 31, 2024
Via arXiv

Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models

Mar 21, 2024
Via arXiv