Alexander Boyd

AI in a vat: Fundamental limits of efficient world modelling for agent sandboxing and interpretability

Apr 06, 2025
Via arXiv