Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!
Abstract:In this paper, we present a framework for learning quadruped navigation by integrating central pattern generators (CPGs), i.e. systems of coupled oscillators, into the deep reinforcement learning (DRL) framework. Through both exteroceptive and proprioceptive sensing, the agent learns to modulate the intrinsic oscillator setpoints (amplitude and frequency) and coordinate rhythmic behavior among different oscillators to track velocity commands while avoiding collisions with the environment. We compare different neural network architectures (i.e. memory-free and memory-enabled) which learn implicit interoscillator couplings, as well as varying the strength of the explicit coupling weights in the oscillator dynamics equations. We train our policies in simulation and perform a sim-to-real transfer to the Unitree Go1 quadruped, where we observe robust navigation in a variety of scenarios. Our results show that both memory-enabled policy representations and explicit interoscillator couplings are beneficial for a successful sim-to-real transfer for navigation tasks. Video results can be found at https://youtu.be/O_LX1oLZOe0.
* arXiv admin note: substantial text overlap with arXiv:2211.00458