Jason J Corso

A Critical Investigation of Deep Reinforcement Learning for Navigation

Jan 04, 2019

Vikas Dhiman, Shurjo Banerjee, Brent Griffin, Jeffrey M Siskind, Jason J Corso

Abstract: The navigation problem is classically approached in two steps: an exploration step, where map-information about the environment is gathered; and an exploitation step, where this information is used to navigate efficiently. Deep reinforcement learning (DRL) algorithms, by contrast, approach the problem of navigation in an end-to-end fashion. Inspired by the classical approach, we ask whether DRL algorithms are able to inherently explore, gather, and exploit map-information over the course of navigation. We build upon the work of Mirowski et al. [2017] and introduce a systematic suite of experiments that vary three parameters: the agent's starting location, the agent's target location, and the maze structure. We choose evaluation metrics that explicitly measure the algorithm's ability to gather and exploit map-information. Our experiments show that when trained and tested on the same maps, the algorithm successfully gathers and exploits map-information. However, when trained and tested on different sets of maps, the algorithm fails to transfer the ability to gather and exploit map-information to unseen maps. Furthermore, we find that when the goal location is randomized and the map is kept static, the algorithm is able to gather and exploit map-information, but the exploitation is far from optimal. We open-source our experimental suite in the hope that it serves as a framework for the comparison of future algorithms and leads to the discovery of robust alternatives to classical navigation methods.
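As a rough illustration of the three experimental parameters named in the abstract, the sketch below enumerates every static/randomized combination of starting location, target location, and maze structure. It is not taken from the authors' open-sourced suite: the names `ExperimentConfig`, `random_spawn`, `random_goal`, `random_maze`, and `enumerate_configs` are hypothetical, and the paper may evaluate only a subset of these combinations.

```python
from itertools import product
from typing import List, NamedTuple


class ExperimentConfig(NamedTuple):
    """Hypothetical description of one experimental condition."""
    random_spawn: bool  # is the agent's starting location randomized?
    random_goal: bool   # is the agent's target location randomized?
    random_maze: bool   # is the maze structure randomized?


def enumerate_configs() -> List[ExperimentConfig]:
    """Return all 2**3 static/random combinations of the three parameters."""
    return [ExperimentConfig(*flags) for flags in product([False, True], repeat=3)]


if __name__ == "__main__":
    for cfg in enumerate_configs():
        # Build a readable label such as "spawn=static, goal=random, maze=static".
        label = ", ".join(
            f"{name.removeprefix('random_')}={'random' if value else 'static'}"
            for name, value in cfg._asdict().items()
        )
        print(label)
```

In this sketch, the condition the abstract singles out (randomized goal, static map) corresponds to `random_goal=True, random_maze=False`.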