Picture for Daniel Kokotajlo

Daniel Kokotajlo

Towards evaluations-based safety cases for AI scheming

Add code
Nov 07, 2024
Viaarxiv icon

Taken out of context: On measuring situational awareness in LLMs

Add code
Sep 01, 2023
Viaarxiv icon

Model evaluation for extreme risks

Add code
May 24, 2023
Figure 1 for Model evaluation for extreme risks
Figure 2 for Model evaluation for extreme risks
Figure 3 for Model evaluation for extreme risks
Figure 4 for Model evaluation for extreme risks
Viaarxiv icon