Abstract: Sparse-reward environments are known to be challenging for reinforcement learning agents. In such environments, efficient and scalable exploration is crucial. Exploration is the means by which an agent gains information about the environment. Building on this view, we propose a new intrinsic reward that systematically quantifies exploratory behavior and promotes state coverage by maximizing the information content of the trajectory taken by the agent. We compare our method to alternative exploration-based intrinsic reward techniques, namely Curiosity Driven Learning and Random Network Distillation. We show that our information-theoretic reward induces efficient exploration and outperforms these baselines on various games, including Montezuma's Revenge, a notoriously difficult task for reinforcement learning. Finally, we propose an extension that maximizes the information content in a discretely compressed latent space, which boosts sample efficiency and generalizes to continuous state spaces.
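The abstract does not specify the exact reward formulation, but the core idea of rewarding the information content of a trajectory can be illustrated with a minimal sketch: below, an assumed entropy-based bonus rewards transitions that increase the empirical Shannon entropy of the visited-state distribution in a discrete state space. The helper names (`trajectory_entropy`, `intrinsic_reward`) are hypothetical, for illustration only, and are not the paper's method.

```python
# Illustrative sketch (assumption): reward the increase in empirical entropy of
# the visited-state distribution as an intrinsic exploration bonus.
from collections import Counter
import math


def trajectory_entropy(states):
    """Empirical Shannon entropy (in nats) of the visited-state distribution."""
    counts = Counter(states)
    total = len(states)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def intrinsic_reward(states_so_far, next_state):
    """Intrinsic reward = gain in trajectory entropy from visiting next_state."""
    before = trajectory_entropy(states_so_far) if states_so_far else 0.0
    after = trajectory_entropy(states_so_far + [next_state])
    return after - before


# A novel state increases entropy and earns a positive bonus; revisiting a
# familiar state yields a smaller (possibly negative) bonus.
visited = ["s0", "s1", "s0"]
print(intrinsic_reward(visited, "s2"))  # novel state
print(intrinsic_reward(visited, "s0"))  # familiar state
```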
Abstract: We demonstrate an equivalence between the reinforcement learning problem and the supervised classification problem. We consequently equate the exploration-exploitation trade-off in reinforcement learning to the dataset imbalance problem in supervised classification, and find similarities in how the two are addressed. From our analysis of these problems we derive a novel loss function applicable to both reinforcement learning and supervised classification. Scope Loss, our new loss function, adjusts gradients to prevent performance losses from over-exploitation and dataset imbalance, without requiring any additional tuning. We test Scope Loss against state-of-the-art loss functions on a basket of benchmark reinforcement learning tasks and a skewed classification dataset, and show that Scope Loss outperforms the alternatives.
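The abstract does not give the form of Scope Loss, so no attempt is made to reproduce it here. As background for the stated equivalence between the two problems, the following sketch shows the standard correspondence between a policy gradient objective and a weighted cross-entropy classification loss: with advantage weights fixed at one, the REINFORCE-style loss reduces exactly to ordinary cross-entropy with actions playing the role of labels. The function name `policy_gradient_loss` is introduced here for illustration.

```python
# Illustrative sketch: policy gradient loss as advantage-weighted cross-entropy.
import torch
import torch.nn.functional as F


def policy_gradient_loss(logits, actions, advantages):
    """REINFORCE-style loss: cross-entropy on sampled actions, weighted by advantage."""
    log_probs = F.log_softmax(logits, dim=-1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    return -(advantages * chosen).mean()


logits = torch.randn(4, 3, requires_grad=True)  # 4 states, 3 actions (or classes)
actions = torch.tensor([0, 2, 1, 0])            # sampled actions (or class labels)
advantages = torch.ones(4)                      # uniform weights -> plain cross-entropy

loss_rl = policy_gradient_loss(logits, actions, advantages)
loss_ce = F.cross_entropy(logits, actions)
print(torch.isclose(loss_rl, loss_ce))          # tensor(True)
```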