Title: Learning Robust Options by Conditional Value at Risk Optimization

Jun 11, 2019

Takuya Hiraoka, Takahisa Imagawa, Tatsuya Mori, Takashi Onishi, Yoshimasa Tsuruoka

Figures 1-4 for Learning Robust Options by Conditional Value at Risk Optimization

Abstract: Options are generally learned by using an inaccurate environment model (or simulator), which contains uncertain model parameters. While there are several methods to learn options that are robust against the uncertainty of model parameters, these methods only consider either the worst case or the average (ordinary) case for learning options. This limited consideration of the cases often produces options that do not work well in the unconsidered case. In this paper, we propose a conditional value at risk (CVaR)-based method to learn options that work well in both the average and worst cases. We extend the CVaR-based policy gradient method proposed by Chow and Ghavamzadeh (2014) to deal with robust Markov decision processes and then apply the extended method to learning robust options. We conduct experiments to evaluate our method in multi-joint robot control tasks (HopperIceBlock, Half-Cheetah, and Walker2D). Experimental results show that our method produces options that 1) give better worst-case performance than the options learned only to minimize the average-case loss, and 2) give better average-case performance than the options learned only to minimize the worst-case loss.

* Video demo: https://drive.google.com/open?id=1xXgSeEa_nNG397ZkIayk3CwYPy_BPy8X
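
To illustrate why a CVaR criterion sits between the average case and the worst case, the following is a minimal sketch that computes an empirical CVaR over sampled losses. It is not the policy gradient method from the paper. The function name `empirical_cvar`, the sample losses, and the convention that `alpha` is the confidence level (so the worst `1 - alpha` fraction of samples is averaged) are all assumptions made for illustration.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst (1 - alpha) fraction of sampled losses.

    Assumed convention: alpha = 0 gives the plain mean, and alpha close
    to 1 approaches the single worst sampled loss.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)  # largest (worst) loss first
    tail_size = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:tail_size]) / tail_size


# Hypothetical losses of one option, each evaluated under a different
# sampled setting of the uncertain model parameters.
losses = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, 8.0, 14.0]

average_case = empirical_cvar(losses, alpha=0.0)  # mean over all samples
tail_case = empirical_cvar(losses, alpha=0.75)    # mean of the worst 25%
worst_case = max(losses)                          # single worst sample

print(average_case, tail_case, worst_case)
```

With these made-up samples the tail average lies between the mean and the maximum. That ordering is the intuition behind optimizing CVaR instead of only the average-case or only the worst-case loss.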
