Vlad Firoiu

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Improving alignment of dialogue agents via targeted human judgements

Sep 28, 2022
Via arXiv

Proving Theorems using Incremental Learning and Hindsight Experience Replay

Dec 20, 2021
Via arXiv

Training a First-Order Theorem Prover from Synthetic Data

Mar 05, 2021
Via arXiv

Learning to Prove from Synthetic Theorems

Jun 19, 2020
Via arXiv

Automated curricula through setter-solver interactions

Sep 27, 2019
Via arXiv

At Human Speed: Deep Reinforcement Learning with Action Delay

Oct 16, 2018
Via arXiv

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

Jun 28, 2018
Via arXiv

Beating the World's Best at Super Smash Bros. with Deep Reinforcement Learning

May 08, 2017
Via arXiv

Automatic Inference for Inverting Software Simulators via Probabilistic Programming

May 31, 2015
Via arXiv