Euan Ong

Compact Proofs of Model Performance via Mechanistic Interpretability

Jun 24, 2024
Via arXiv

Provable Guarantees for Model Performance via Mechanistic Interpretability

Jun 18, 2024
Via arXiv

Successor Heads: Recurring, Interpretable Attention Heads In The Wild

Dec 14, 2023
Via arXiv

Image Hijacks: Adversarial Images can Control Generative Models at Runtime

Sep 18, 2023
Via arXiv

Learnable Commutative Monoids for Graph Neural Networks

Dec 16, 2022
Via arXiv