Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

William Paivine

Distributed Learning of Decentralized Control Policies for Articulated Mobile Robots

Jan 24, 2019

Guillaume Sartoretti, William Paivine, Yunfei Shi, Yue Wu, Howie Choset

Figure 1 for Distributed Learning of Decentralized Control Policies for Articulated Mobile Robots

Figure 2 for Distributed Learning of Decentralized Control Policies for Articulated Mobile Robots

Figure 3 for Distributed Learning of Decentralized Control Policies for Articulated Mobile Robots

Figure 4 for Distributed Learning of Decentralized Control Policies for Articulated Mobile Robots

Abstract:State-of-the-art distributed algorithms for reinforcement learning rely on multiple independent agents, which simultaneously learn in parallel environments while asynchronously updating a common, shared policy. Moreover, decentralized control architectures (e.g., CPGs) can coordinate spatially distributed portions of an articulated robot to achieve system-level objectives. In this work, we investigate the relationship between distributed learning and decentralized control by learning decentralized control policies for the locomotion of articulated robots in challenging environments. To this end, we present an approach that leverages the structure of the asynchronous advantage actor-critic (A3C) algorithm to provide a natural means of learning decentralized control policies on a single articulated robot. Our primary contribution shows individual agents in the A3C algorithm can be defined by independently controlled portions of the robot's body, thus enabling distributed learning on a single robot for efficient hardware implementation. We present results of closed-loop locomotion in unstructured terrains on a snake and a hexapod robot, using decentralized controllers learned offline and online respectively. Preprint of the paper submitted to the IEEE Transactions in Robotics (T-RO) journal in October 2018, and conditionally accepted for publication as a regular paper in January 2019.

* Copyright 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Via

Access Paper or Ask Questions

Learning to Sequence Robot Behaviors for Visual Navigation

Mar 26, 2018

Hadi Salman, Puneet Singhal, Tanmay Shankar, Peng Yin, Ali Salman, William Paivine, Guillaume Sartoretti, Matthew Travers, Howie Choset

Figure 1 for Learning to Sequence Robot Behaviors for Visual Navigation

Figure 2 for Learning to Sequence Robot Behaviors for Visual Navigation

Figure 3 for Learning to Sequence Robot Behaviors for Visual Navigation

Figure 4 for Learning to Sequence Robot Behaviors for Visual Navigation

Abstract:Recent literature in the robotics community has focused on learning robot behaviors that abstract out lower-level details of robot control. To fully leverage the efficacy of such behaviors, it is necessary to select and sequence them to achieve a given task. In this paper, we present an approach to both learn and sequence robot behaviors, applied to the problem of visual navigation of mobile robots. We construct a layered representation of control policies composed of low- level behaviors and a meta-level policy. The low-level behaviors enable the robot to locomote in a particular environment while avoiding obstacles, and the meta-level policy actively selects the low-level behavior most appropriate for the current situation based purely on visual feedback. We demonstrate the effectiveness of our method on three simulated robot navigation tasks: a legged hexapod robot which must successfully traverse varying terrain, a wheeled robot which must navigate a maze-like course while avoiding obstacles, and finally a wheeled robot navigating in the presence of dynamic obstacles. We show that by learning control policies in a layered manner, we gain the ability to successfully traverse new compound environments composed of distinct sub-environments, and outperform both the low-level behaviors in their respective sub-environments, as well as a hand-crafted selection of low-level policies on these compound environments.

Via

Access Paper or Ask Questions