Gregory Farquhar

Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design

Oct 04, 2023

An Investigation of the Bias-Variance Tradeoff in Meta-Gradients

Sep 22, 2022

Model-Value Inconsistency as a Signal for Epistemic Uncertainty

Dec 08, 2021

Self-Consistent Models and Values

Oct 25, 2021

Proper Value Equivalence

Jun 18, 2021

PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

Feb 24, 2021

Weighted QMIX: Expanding Monotonic Value Function Factorisation

Jun 18, 2020

The Impact of Non-stationarity on Generalisation in Deep Reinforcement Learning

Jun 16, 2020

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Mar 19, 2020

Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning

Sep 23, 2019