Jonah Brown-Cohen

On scalable oversight with weak LLMs judging strong LLMs

Jul 05, 2024

Scalable AI Safety via Doubly-Efficient Debate

Nov 23, 2023

Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models

Oct 26, 2023

Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions

Jun 09, 2023

Faster Algorithms and Constant Lower Bounds for the Worst-Case Expected Error

Dec 27, 2021