Boyao Li

On Neural Networks as Infinite Tree-Structured Probabilistic Graphical Models

May 27, 2023

Boyao Li, Alexandar J. Thomson, Matthew M. Engelhard, David Page

Abstract: Deep neural networks (DNNs) lack the precise semantics and definitive probabilistic interpretation of probabilistic graphical models (PGMs). In this paper, we propose an innovative solution by constructing infinite tree-structured PGMs that correspond exactly to neural networks. Our research reveals that DNNs, during forward propagation, indeed perform approximations of PGM inference that are precise in this alternative PGM structure. Not only does our research complement existing studies that describe neural networks as kernel machines or infinite-sized Gaussian processes, it also elucidates a more direct approximation that DNNs make to exact inference in PGMs. Potential benefits include improved pedagogy and interpretation of DNNs, and algorithms that can merge the strengths of PGMs and DNNs.

ACDER: Augmented Curiosity-Driven Experience Replay

Nov 16, 2020

Boyao Li, Tao Lu, Jiayi Li, Ning Lu, Yinghao Cai, Shuo Wang

Abstract: Exploration in environments with sparse feedback remains a challenging research problem in reinforcement learning (RL). When the RL agent explores the environment randomly, exploration efficiency is low, especially in robotic manipulation tasks with high-dimensional continuous state and action spaces. In this paper, we propose a novel method, called Augmented Curiosity-Driven Experience Replay (ACDER), which leverages (i) a new goal-oriented curiosity-driven exploration to encourage the agent to pursue novel and task-relevant states more purposefully and (ii) dynamic initial state selection as an automatic exploratory curriculum to further improve sample efficiency. Our approach complements Hindsight Experience Replay (HER) by introducing a new way to pursue valuable states. Experiments are conducted on four challenging robotic manipulation tasks with binary rewards: Reach, Push, Pick&Place and Multi-step Push. The empirical results show that our proposed method significantly outperforms existing methods in the first three basic tasks and also achieves satisfactory performance in multi-step robotic task learning.

Hindsight Generative Adversarial Imitation Learning

Mar 19, 2019

Naijun Liu, Tao Lu, Yinghao Cai, Boyao Li, Shuo Wang

Abstract: Compared to reinforcement learning, imitation learning (IL) is a powerful paradigm for training agents to learn control policies efficiently from expert demonstrations. However, in most cases, obtaining demonstration data is costly and laborious, which poses a significant challenge in some scenarios. A promising alternative is to train agents to learn skills via imitation learning without expert demonstrations, which, to some extent, would greatly expand the applicability of imitation learning. To this end, we propose the Hindsight Generative Adversarial Imitation Learning (HGAIL) algorithm, which aims to perform imitation learning without the need for demonstrations. By combining the hindsight idea with the generative adversarial imitation learning (GAIL) framework, we successfully realize imitation learning in cases where expert demonstration data are not available. Experiments show that the proposed method can train policies with performance comparable to current imitation learning methods. Furthermore, HGAIL essentially provides a curriculum learning mechanism, which is critical for learning policies.
