Title: Grounding Aleatoric Uncertainty in Unsupervised Environment Design

Jul 11, 2022

Minqi Jiang, Michael Dennis, Jack Parker-Holder, Andrei Lupu, Heinrich Küttler, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

Abstract: Adaptive curricula in reinforcement learning (RL) have proven effective for producing policies robust to discrepancies between the train and test environment. Recently, the Unsupervised Environment Design (UED) framework generalized RL curricula to generating sequences of entire environments, leading to new methods with robust minimax regret properties. Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution. We formalize this phenomenon as curriculum-induced covariate shift (CICS), and describe how its occurrence in aleatoric parameters can lead to suboptimal policies. Directly sampling these parameters from the ground-truth distribution avoids the issue, but thwarts curriculum learning. We propose SAMPLR, a minimax regret UED method that optimizes the ground-truth utility function, even when the underlying training data is biased due to CICS. We prove, and validate on challenging domains, that our approach preserves optimality under the ground-truth distribution, while promoting robustness across the full range of environment settings.
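The curriculum-induced covariate shift described in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper and is not SAMPLR. It assumes a hypothetical one-step betting game in which the only aleatoric parameter is the probability that a coin lands heads. The names `expected_return` and `best_action` and the two probabilities are illustrative assumptions. The example shows only that an action which is optimal under a shifted training distribution can be suboptimal under the ground-truth distribution.

```python
# Toy illustration of curriculum-induced covariate shift (CICS).
# Hypothetical setup, not from the paper: the agent bets on a coin flip
# and earns reward 1 when the bet matches the outcome, 0 otherwise.

ACTIONS = ("heads", "tails")

# Assumed values chosen only for illustration.
GROUND_TRUTH_P_HEADS = 0.8  # coin bias in the intended deployment setting
CURRICULUM_P_HEADS = 0.3    # coin bias induced by a curriculum that oversamples "tails" levels


def expected_return(action: str, p_heads: float) -> float:
    """Expected reward of betting `action` when heads occurs with probability `p_heads`."""
    return p_heads if action == "heads" else 1.0 - p_heads


def best_action(p_heads: float) -> str:
    """Action that maximizes expected reward under the given coin bias."""
    return max(ACTIONS, key=lambda action: expected_return(action, p_heads))


# Policy fitted to the shifted (curriculum) distribution.
trained_action = best_action(CURRICULUM_P_HEADS)
# Policy that is optimal under the ground-truth distribution.
grounded_action = best_action(GROUND_TRUTH_P_HEADS)

# Both policies are evaluated under the ground-truth distribution.
trained_value = expected_return(trained_action, GROUND_TRUTH_P_HEADS)
grounded_value = expected_return(grounded_action, GROUND_TRUTH_P_HEADS)

print(f"trained on curriculum: bet {trained_action}, ground-truth value {trained_value:.2f}")
print(f"ground-truth optimal:  bet {grounded_action}, ground-truth value {grounded_value:.2f}")
```

In this toy setting the curriculum-trained bet is the worse of the two at deployment, which is the kind of gap the abstract attributes to CICS. How SAMPLR corrects for it is described in the paper and is not reproduced here.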

View paper on OpenReview
