Abram Friesen

Model-Value Inconsistency as a Signal for Epistemic Uncertainty

Dec 08, 2021

Angelos Filos, Eszter Vértes, Zita Marinho, Gregory Farquhar, Diana Borsa, Abram Friesen, Feryal Behbahani, Tom Schaul, André Barreto, Simon Osindero

Abstract: Using a model of the environment and a value function, an agent can construct many estimates of a state's value, by unrolling the model for different lengths and bootstrapping with its value function. Our key insight is that one can treat this set of value estimates as a type of ensemble, which we call an implicit value ensemble (IVE). Consequently, the discrepancy between these estimates can be used as a proxy for the agent's epistemic uncertainty; we term this signal model-value inconsistency, or self-inconsistency for short. Unlike prior work, which estimates uncertainty by training an ensemble of many models and/or value functions, this approach requires only the single model and value function that are already being learned in most model-based reinforcement learning algorithms. We provide empirical evidence in both tabular and function approximation settings from pixels that self-inconsistency is useful (i) as a signal for exploration, (ii) for acting safely under distribution shifts, and (iii) for robustifying value-based planning with a model.

* The first three authors contributed equally
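
The idea in the abstract can be sketched in a small tabular setting. The following is only an illustration under stated assumptions, not the authors' implementation: the names `implicit_value_ensemble` and `self_inconsistency`, the fixed-policy transition matrix `P`, the reward vector `r`, and the use of the standard deviation as the discrepancy measure are all choices made here for the example.

```python
import numpy as np

def implicit_value_ensemble(P, r, v, gamma, n_max):
    """Return k-step value estimates for k = 0..n_max, shape (n_max + 1, S).

    Assumed (hypothetical) inputs for a fixed policy:
      P: (S, S) learned transition matrix, r: (S,) learned expected reward,
      v: (S,) learned value function, gamma: discount factor.
    """
    estimates = [v.copy()]          # k = 0: the value function alone
    current = v
    for _ in range(n_max):
        # Unroll the model one more step, then bootstrap with the value function.
        current = r + gamma * P @ current
        estimates.append(current)
    return np.stack(estimates)

def self_inconsistency(estimates):
    """Per-state disagreement across the ensemble (standard deviation, assumed)."""
    return estimates.std(axis=0)

# Two-state chain: state 0 moves to state 1, state 1 loops on itself with reward 1.
P = np.array([[0.0, 1.0], [0.0, 1.0]])
r = np.array([0.0, 1.0])
gamma = 0.5

v_consistent = np.array([1.0, 2.0])    # exact fixed point of this model
v_inconsistent = np.array([0.0, 2.0])  # wrong at state 0 only

sigma_consistent = self_inconsistency(
    implicit_value_ensemble(P, r, v_consistent, gamma, n_max=5))
sigma_inconsistent = self_inconsistency(
    implicit_value_ensemble(P, r, v_inconsistent, gamma, n_max=5))
```

When the value function agrees with the model, every unroll length gives the same estimate and the signal is zero. In this toy example the disagreement appears only at the state where the value function is wrong.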

Introducing Symmetries to Black Box Meta Reinforcement Learning

Sep 22, 2021

Louis Kirsch, Sebastian Flennerhag, Hado van Hasselt, Abram Friesen, Junhyuk Oh, Yutian Chen

Abstract: Meta reinforcement learning (RL) attempts to discover new RL algorithms automatically from environment interaction. In so-called black-box approaches, the policy and the learning algorithm are jointly represented by a single neural network. These methods are very flexible, but they tend to underperform in terms of generalisation to new, unseen environments. In this paper, we explore the role of symmetries in meta-generalisation. We show that a recent successful meta RL approach that meta-learns an objective for backpropagation-based learning exhibits certain symmetries (specifically the reuse of the learning rule, and invariance to input and output permutations) that are not present in typical black-box meta RL systems. We hypothesise that these symmetries can play an important role in meta-generalisation. Building off recent work in black-box supervised meta learning, we develop a black-box meta RL system that exhibits these same symmetries. We show through careful experimentation that incorporating these symmetries can lead to algorithms with a greater ability to generalise to unseen action and observation spaces, tasks, and environments.
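
The two symmetries named in the abstract can be illustrated with a toy example. This is not the paper's system: it assumes a hand-written delta rule in a supervised setting, where the paper meta-learns its rule for RL, and the function `run_learner` and all data are hypothetical. It only shows that when one local rule is reused for every connection, permuting the inputs or outputs does not change what is learned.

```python
import numpy as np

def run_learner(xs, targets, lr=0.1):
    """Online learner whose every connection (i, j) uses the same local rule.

    Returns the prediction made at each step, before that step's update.
    """
    n_in, n_out = xs.shape[1], targets.shape[1]
    w = np.zeros((n_in, n_out))    # symmetric initial state
    preds = []
    for x, t in zip(xs, targets):
        y = x @ w
        preds.append(y)
        # Shared rule: delta w_ij = lr * x_i * (t_j - y_j) for every i, j.
        w = w + lr * np.outer(x, t - y)
    return np.array(preds)

rng = np.random.default_rng(0)
xs = rng.normal(size=(20, 4))
targets = rng.normal(size=(20, 3))
in_perm = rng.permutation(4)
out_perm = rng.permutation(3)

base = run_learner(xs, targets)
with_permuted_inputs = run_learner(xs[:, in_perm], targets)
with_permuted_outputs = run_learner(xs, targets[:, out_perm])
```

Here `with_permuted_inputs` matches `base` up to floating-point error, and `with_permuted_outputs` matches `base[:, out_perm]`. A learner with separate, unshared update parameters per connection would not in general have this property.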
