Title: Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning

Jul 07, 2023

Seungyong Moon, Junyoung Yeom, Bumsoo Park, Hyun Oh Song

Abstract: Discovering achievements with a hierarchical structure on procedurally generated environments poses a significant challenge. This requires agents to possess a broad range of abilities, including generalization and long-term reasoning. Many prior methods are built upon model-based or hierarchical approaches, with the belief that an explicit module for long-term planning would be beneficial for learning hierarchical achievements. However, these methods require an excessive amount of environment interactions or large model sizes, limiting their practicality. In this work, we identify that proximal policy optimization (PPO), a simple and versatile model-free algorithm, outperforms the prior methods with recent implementation practices. Moreover, we find that the PPO agent can predict the next achievement to be unlocked to some extent, though with low confidence. Based on this observation, we propose a novel contrastive learning method, called achievement distillation, that strengthens the agent's capability to predict the next achievement. Our method exhibits a strong capacity for discovering hierarchical achievements and shows state-of-the-art performance on the challenging Crafter environment using fewer model parameters in a sample-efficient regime.
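The abstract does not spell out the contrastive objective behind achievement distillation. As a rough illustration of what a contrastive loss looks like in general, the sketch below implements the generic InfoNCE loss in plain Python. It is not the paper's method: the function name `info_nce_loss`, the `temperature` value, and the pairing of each anchor embedding with one positive embedding (for example, a state representation and a representation of the next achievement) are all assumptions made for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length (assumes the vector is nonzero)."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch (hypothetical helper, not from the paper).

    positives[i] is treated as the matching sample for anchors[i];
    every other row of `positives` acts as a negative for that anchor.
    """
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Cosine similarity to every candidate, sharpened by the temperature.
        logits = [dot(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching candidate as the target class.
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy batch: two orthogonal embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = info_nce_loss(embeddings, embeddings)         # correct pairing
loss_shuffled = info_nce_loss(embeddings, embeddings[::-1])  # wrong pairing
```

On this toy batch the correctly paired embeddings give a much lower loss than the shuffled ones. That is the behavior a contrastive objective rewards: matching pairs are pulled together and mismatched pairs are pushed apart.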
