DJ Strouse

HARP: A challenging human-annotated math reasoning benchmark

Dec 11, 2024, via arXiv

Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs

Feb 22, 2024, via arXiv

Confronting Reward Model Overoptimization with Constrained RLHF

Oct 10, 2023, via arXiv

Melting Pot 2.0

Dec 13, 2022, via arXiv

In-context Reinforcement Learning with Algorithm Distillation

Oct 25, 2022, via arXiv

Semantic Exploration from Language Abstractions and Pretrained Representations

Apr 08, 2022, via arXiv

Collaborating with Humans without Human Data

Oct 15, 2021, via arXiv

Learning more skills through optimistic exploration

Jul 29, 2021, via arXiv

A Neural Architecture for Designing Truthful and Efficient Auctions

Jul 11, 2019, via arXiv

Intrinsic Social Motivation via Causal Influence in Multi-Agent RL

Oct 19, 2018, via arXiv