Eoin Farrell

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability

Mar 13, 2025
Via arXiv

Applying sparse autoencoders to unlearn knowledge in language models

Oct 25, 2024
Via arXiv