Alex Gibson

Compact Proofs of Model Performance via Mechanistic Interpretability

Jun 24, 2024
Via arXiv

Provable Guarantees for Model Performance via Mechanistic Interpretability

Jun 18, 2024
Via arXiv