Picture for Noam Brown

Noam Brown

OpenAI o1 System Card

Add code
Dec 21, 2024
Viaarxiv icon

The Update Equivalence Framework for Decision-Time Planning

Add code
Apr 25, 2023
Figure 1 for The Update Equivalence Framework for Decision-Time Planning
Figure 2 for The Update Equivalence Framework for Decision-Time Planning
Figure 3 for The Update Equivalence Framework for Decision-Time Planning
Figure 4 for The Update Equivalence Framework for Decision-Time Planning
Viaarxiv icon

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

Add code
Jan 22, 2023
Figure 1 for Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Figure 2 for Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Figure 3 for Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Figure 4 for Abstracting Imperfect Information Away from Two-Player Zero-Sum Games
Viaarxiv icon

Human-AI Coordination via Human-Regularized Search and Learning

Add code
Oct 11, 2022
Figure 1 for Human-AI Coordination via Human-Regularized Search and Learning
Figure 2 for Human-AI Coordination via Human-Regularized Search and Learning
Figure 3 for Human-AI Coordination via Human-Regularized Search and Learning
Viaarxiv icon

Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

Add code
Oct 11, 2022
Figure 1 for Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning
Figure 2 for Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning
Figure 3 for Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning
Figure 4 for Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning
Viaarxiv icon

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

Add code
Jun 12, 2022
Figure 1 for A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Figure 2 for A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Figure 3 for A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Figure 4 for A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Viaarxiv icon

Modeling Strong and Human-Like Gameplay with KL-Regularized Search

Add code
Dec 14, 2021
Figure 1 for Modeling Strong and Human-Like Gameplay with KL-Regularized Search
Figure 2 for Modeling Strong and Human-Like Gameplay with KL-Regularized Search
Figure 3 for Modeling Strong and Human-Like Gameplay with KL-Regularized Search
Figure 4 for Modeling Strong and Human-Like Gameplay with KL-Regularized Search
Viaarxiv icon

No-Press Diplomacy from Scratch

Add code
Oct 06, 2021
Figure 1 for No-Press Diplomacy from Scratch
Figure 2 for No-Press Diplomacy from Scratch
Figure 3 for No-Press Diplomacy from Scratch
Figure 4 for No-Press Diplomacy from Scratch
Viaarxiv icon

Scalable Online Planning via Reinforcement Learning Fine-Tuning

Add code
Sep 30, 2021
Figure 1 for Scalable Online Planning via Reinforcement Learning Fine-Tuning
Figure 2 for Scalable Online Planning via Reinforcement Learning Fine-Tuning
Figure 3 for Scalable Online Planning via Reinforcement Learning Fine-Tuning
Figure 4 for Scalable Online Planning via Reinforcement Learning Fine-Tuning
Viaarxiv icon

Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings

Add code
Jun 16, 2021
Figure 1 for Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings
Figure 2 for Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings
Figure 3 for Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings
Figure 4 for Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings
Viaarxiv icon