Abstract:State-of-the-art reinforcement learning is now able to learn versatile locomotion, balancing and push-recovery capabilities for bipedal robots in simulation. Yet, the reality gap has mostly been overlooked and the simulated results hardly transfer to real hardware. Either it is unsuccessful in practice because the physics is over-simplified and hardware limitations are ignored, or regularity is not guaranteed and unexpected hazardous motions can occur. This paper presents a reinforcement learning framework capable of learning robust standing push recovery for bipedal robots with a smooth out-of-the-box transfer to reality, requiring only instantaneous proprioceptive observations. By combining original termination conditions and policy smoothness conditioning, we achieve stable learning, sim-to-real transfer and safety using a policy without memory nor observation history. Reward shaping is then used to give insights into how to keep balance. We demonstrate its performance in reality on the lower-limb medical exoskeleton Atalante.
Abstract:We present Tianshou, a highly modularized python library for deep reinforcement learning (DRL) that uses PyTorch as its backend. Tianshou aims to provide building blocks to replicate common RL experiments and has officially supported more than 15 classic algorithms succinctly. To facilitate related research and prove Tianshou's reliability, we release Tianshou's benchmark of MuJoCo environments, covering 9 classic algorithms and 9/13 Mujoco tasks with state-of-the-art performance. We open-sourced Tianshou at https://github.com/thu-ml/tianshou/, which has received over 3k stars and become one of the most popular PyTorch-based DRL libraries.
Abstract:Autonomous robots require online trajectory planning capability to operate in the real world. Efficient offline trajectory planning methods already exist, but are computationally demanding, preventing their use online. In this paper, we present a novel algorithm called Guided Trajectory Learning that learns a function approximation of solutions computed through trajectory optimization while ensuring accurate and reliable predictions. This function approximation is then used online to generate trajectories. This algorithm is designed to be easy to implement, and practical since it does not require massive computing power. It is readily applicable to any robotics systems and effortless to set up on real hardware since robust control strategies are usually already available. We demonstrate the performance of our algorithm on flat-foot walking with the self-balanced exoskeleton Atalante.
Abstract:This paper presents and experimentally demonstrates a novel framework for variable assistance on lower body exoskeletons, based upon safety-critical control methods. Existing work has shown that providing some freedom of movement around a nominal gait, instead of rigidly following it, accelerates the spinal learning process of people with a walking impediment when using a lower body exoskeleton. With this as motivation, we present a method to accurately control how much a subject is allowed to deviate from a given gait while ensuring robustness to patient perturbation. This method leverages controlled set invariance tools to force certain joints to remain inside predefined trajectory tubes in a minimally invasive way. The effectiveness of the method is demonstrated experimentally with able-bodied subjects and the Atalante lower body exoskeleton.
Abstract:This manuscript presents control of a high-DOF fully actuated lower-limb exoskeleton for paraplegic individuals. The key novelty is the ability for the user to walk without the use of crutches or other external means of stabilization. We harness the power of modern optimization techniques and supervised machine learning to develop a smooth feedback control policy that provides robust velocity regulation and perturbation rejection. Preliminary evaluation of the stability and robustness of the proposed approach is demonstrated through the Gazebo simulation environment. In addition, preliminary experimental results with (complete) paraplegic individuals are included for the previous version of the controller.