Matt Hoffman

Columbia University

Gemma 2: Improving Open Language Models at a Practical Size

Aug 02, 2024

BOND: Aligning LLMs with Best-of-N Distillation

Jul 19, 2024

An Empirical Study of Implicit Regularization in Deep Offline RL

Jul 07, 2022

Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach

Apr 22, 2022

RL Unplugged: Benchmarks for Offline Reinforcement Learning

Jul 02, 2020

Acme: A Research Framework for Distributed Reinforcement Learning

Jun 01, 2020

Improving the Gating Mechanism of Recurrent Neural Networks

Oct 22, 2019

Making Efficient Use of Demonstrations to Solve Hard Exploration Problems

Sep 03, 2019

TensorFlow Distributions

Nov 28, 2017

Celeste: Variational inference for a generative model of astronomical images

Jun 03, 2015