Thomas Klein

How Aligned are Different Alignment Metrics?

Jul 10, 2024
Via arXiv

Scale Alone Does not Improve Mechanistic Interpretability in Vision Models

Jul 11, 2023
Via arXiv