Title: SOAC: The Soft Option Actor-Critic Architecture

Jun 25, 2020

Chenghao Li, Xiaoteng Ma, Chongjie Zhang, Jun Yang, Li Xia, Qianchuan Zhao

Abstract: The option framework has shown great promise by automatically extracting temporally extended sub-tasks from a long-horizon task. Methods have been proposed for concurrently learning low-level intra-option policies and a high-level option selection policy. However, existing methods typically suffer from two major challenges: ineffective exploration and unstable updates. In this paper, we present a novel and stable off-policy approach that builds on the maximum entropy model to address these challenges. Our approach introduces an information-theoretic intrinsic reward to encourage the identification of diverse and effective options. Meanwhile, we use a probabilistic inference model to simplify the optimization problem into fitting optimal trajectories. Experimental results demonstrate that our approach significantly outperforms prior on-policy and off-policy methods on a range of MuJoCo benchmark tasks while still providing benefits for transfer learning. In these tasks, our approach learns a diverse set of options, each with a strongly coherent state-action space.
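For readers unfamiliar with the option framework the abstract builds on, the following is a minimal sketch of standard call-and-return option execution: a high-level policy selects an option, that option's intra-option policy picks actions, and control returns to the high-level policy when the option's termination function fires. This is not the SOAC algorithm (it has no learning, no entropy term, and no intrinsic reward). The names `Option`, `run_options`, `select_option`, `step`, and the toy integer chain environment are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Option:
    """One option: an intra-option policy plus a termination probability."""
    policy: Callable[[int], int]         # state -> action
    termination: Callable[[int], float]  # state -> probability of terminating


def run_options(
    options: List[Option],
    select_option: Callable[[int], int],
    step: Callable[[int, int], int],
    state: int,
    horizon: int,
    rng: random.Random,
) -> List[Tuple[int, int, int]]:
    """Call-and-return execution: keep the current option until it terminates."""
    trace = []
    current = select_option(state)
    for _ in range(horizon):
        action = options[current].policy(state)
        trace.append((state, current, action))
        state = step(state, action)
        # Sample termination in the new state; on termination, pick a new option.
        if rng.random() < options[current].termination(state):
            current = select_option(state)
    return trace


# Hypothetical toy setup: states are integers on a chain, actions are +1 or -1.
OPTIONS = [
    Option(policy=lambda s: +1, termination=lambda s: 1.0 if s >= 3 else 0.0),
    Option(policy=lambda s: -1, termination=lambda s: 1.0 if s <= 0 else 0.0),
]


def select_option(state: int) -> int:
    """Hand-written high-level policy: walk left from the right end, else right."""
    return 1 if state >= 3 else 0


def step(state: int, action: int) -> int:
    """Deterministic chain dynamics."""
    return state + action


if __name__ == "__main__":
    for state, option, action in run_options(
        OPTIONS, select_option, step, state=0, horizon=6, rng=random.Random(0)
    ):
        print(f"state={state} option={option} action={action:+d}")
```

In this toy setup each option stays active for several consecutive steps, which is the sense in which options are temporally extended. In a learned method, the hand-written `select_option`, `policy`, and `termination` functions would be replaced by trained models.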
