Runxuan Jiang

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

Mar 06, 2024

Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari

Abstract: Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for their wide applications in many important Reinforcement Learning (RL) tasks. However, while recent advancements in MTRL theory have focused on improved statistical efficiency by assuming a shared structure across tasks, exploration, a crucial aspect of RL, has been largely overlooked. This paper addresses this gap by showing that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with a myopic exploration design such as $\epsilon$-greedy, which is inefficient in general, can be sample-efficient for MTRL. To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL. It may also shed light on the enigmatic success of myopic exploration across its wide range of practical applications. To validate the role of diversity, we conduct experiments on synthetic robotic control environments, where the diverse task set aligns with the task selection made by automatic curriculum learning, which is empirically shown to improve sample efficiency.
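
For readers unfamiliar with the term, the $\epsilon$-greedy rule mentioned in the abstract can be sketched as follows. This is a generic textbook illustration, not the paper's policy-sharing algorithm; the function name `epsilon_greedy` and its parameters `q_values`, `epsilon`, and `rng` are hypothetical names chosen for this example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index for one decision step.

    With probability `epsilon` the action is drawn uniformly at random
    (exploration); otherwise the action with the highest estimated value
    is chosen (exploitation). The rule is "myopic" because the random
    step ignores any long-term plan for reaching unexplored states.
    """
    if rng.random() < epsilon:
        # Explore: uniform random action.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest value estimate (first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0.0` the rule is purely greedy, and with `epsilon=1.0` it is purely random; typical uses pick a small value in between.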

TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search

Jun 12, 2020

Tarun Gogineni, Ziping Xu, Exequiel Punzalan, Runxuan Jiang, Joshua Kammeraad, Ambuj Tewari, Paul Zimmerman

Abstract: Molecular geometry prediction of flexible molecules, or conformer search, is a long-standing challenge in computational chemistry. This task is of great importance for predicting structure-activity relationships for a wide variety of substances ranging from biomolecules to ubiquitous materials. Substantial computational resources are invested in Monte Carlo and Molecular Dynamics methods to generate diverse and representative conformer sets for medium to large molecules, which remain intractable for chemoinformatic conformer search methods. We present TorsionNet, an efficient sequential conformer search technique based on reinforcement learning under the rigid rotor approximation. The model is trained via curriculum learning, whose theoretical benefit is explored in detail, to maximize a novel metric grounded in thermodynamics called the Gibbs Score. Our experimental results show that TorsionNet outperforms the highest-scoring chemoinformatics method by 4x on large branched alkanes, and by several orders of magnitude on the previously unexplored biopolymer lignin, with applications in renewable energy.
