Bowen Baker

Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation

Mar 14, 2025
Via arXiv

OpenAI o1 System Card

Dec 21, 2024
Via arXiv

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

Dec 14, 2023
Via arXiv

Let's Verify Step by Step

May 31, 2023
Via arXiv

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Jun 23, 2022
Via arXiv

Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft

Jun 28, 2021
Via arXiv

Emergent Reciprocity and Team Formation from Randomized Uncertain Social Preferences

Nov 10, 2020
Via arXiv

Emergent Tool Use From Multi-Agent Autocurricula

Sep 17, 2019
Via arXiv

Learning Dexterous In-Hand Manipulation

Jan 18, 2019
Via arXiv

Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

Mar 10, 2018
Via arXiv