Bobak Shahriari

Preference Optimization as Probabilistic Inference

Oct 05, 2024
Via arXiv

Gemma 2: Improving Open Language Models at a Practical Size

Aug 02, 2024
Via arXiv

Gemma: Open Models Based on Gemini Research and Technology

Mar 13, 2024
Via arXiv

Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning

May 09, 2023
Via arXiv

Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach

Apr 22, 2022
Via arXiv

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning

Jun 15, 2021
Via arXiv

Critic Regularized Regression

Jun 26, 2020
Via arXiv

Acme: A Research Framework for Distributed Reinforcement Learning

Jun 01, 2020
Via arXiv

Making Efficient Use of Demonstrations to Solve Hard Exploration Problems

Sep 03, 2019
Via arXiv

Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?

May 12, 2016
Via arXiv