Abstract: Deep reinforcement learning (DRL) has emerged as a promising solution for mastering explosive and versatile quadrupedal jumping skills. However, current DRL-based frameworks usually rely on well-defined reference trajectories, which are obtained by capturing animal motions or transferring experience from existing controllers. This work explores the possibility of learning dynamic jumping without imitating a reference trajectory. To this end, we incorporate a curriculum design into DRL so that challenging tasks are accomplished progressively. Starting from a vertical in-place jump, we generalize the learned policy to forward and diagonal jumps and, finally, learn to jump across obstacles. Conditioned on the desired landing location, orientation, and obstacle dimensions, the proposed approach produces a wide range of jumping motions, including omnidirectional and robust jumping, and removes the effort of extracting references in advance. In particular, without constraints from a reference motion, a 90 cm forward jump is achieved, exceeding the records reported for similar robots in the existing literature. Additionally, continuous jumping on a soft grassy floor is accomplished, even though this surface was not encountered during training. A supplementary video showing our results can be found at https://youtu.be/nRaMCrwU5X8.
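Below is a minimal, hypothetical sketch of the staged curriculum this abstract describes, assuming the policy is conditioned on a commanded landing displacement, heading change, and obstacle height. All names (JumpCommand, CURRICULUM_STAGES, sample_command) and the numeric ranges are illustrative, not taken from the paper, and the actual policy and training loop are omitted.

```python
# Illustrative sketch of a staged jump curriculum (hypothetical names and ranges).
import random
from dataclasses import dataclass

@dataclass
class JumpCommand:
    dx: float          # desired forward displacement of the landing location [m]
    dy: float          # desired lateral displacement [m]
    dyaw: float        # desired change in heading at landing [rad]
    obstacle_h: float  # obstacle height to clear [m]

# Each stage widens the range of commands the policy is conditioned on.
CURRICULUM_STAGES = [
    {"dx": (0.0, 0.0), "dy": (0.0, 0.0),  "dyaw": (0.0, 0.0),  "obs": (0.0, 0.0)},  # in-place vertical jump
    {"dx": (0.0, 0.6), "dy": (-0.3, 0.3), "dyaw": (-0.5, 0.5), "obs": (0.0, 0.0)},  # forward / diagonal jumps
    {"dx": (0.2, 0.9), "dy": (-0.3, 0.3), "dyaw": (-0.5, 0.5), "obs": (0.0, 0.2)},  # jumps across obstacles
]

def sample_command(stage: int) -> JumpCommand:
    """Sample a goal-conditioned jump command from the current curriculum stage."""
    s = CURRICULUM_STAGES[stage]
    return JumpCommand(
        dx=random.uniform(*s["dx"]),
        dy=random.uniform(*s["dy"]),
        dyaw=random.uniform(*s["dyaw"]),
        obstacle_h=random.uniform(*s["obs"]),
    )

if __name__ == "__main__":
    # In training, one would advance the stage once the policy's success rate
    # on the current stage exceeds a threshold; here we only print samples.
    for stage in range(len(CURRICULUM_STAGES)):
        print(stage, sample_command(stage))
```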
Abstract: Controlled execution of dynamic motions in quadrupedal robots, especially those with articulated soft bodies, presents a unique set of challenges that traditional methods struggle to address efficiently. In this study, we tackle these issues with a simple yet effective two-stage learning framework for generating dynamic motions for quadrupedal robots. First, a gradient-free evolution strategy is employed to discover simply represented control policies, eliminating the need for a predefined reference motion. We then refine these policies using deep reinforcement learning. Our approach enables the acquisition of complex motions such as pronking and back-flipping, effectively from scratch. Additionally, our method simplifies the traditionally labour-intensive task of reward shaping, boosting the efficiency of the learning process. Importantly, our framework proves particularly effective for articulated soft quadrupeds, whose inherent compliance and adaptability make them ideal for dynamic tasks but also introduce unique control challenges.
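As a rough illustration of the two-stage idea, the sketch below runs a simple gradient-free evolution strategy over the parameters of a compactly represented policy; the result would then seed a DRL refinement stage (not implemented here). rollout_return is a hypothetical stand-in for the simulated episode return, and none of the names or hyperparameters come from the paper.

```python
# Stage 1 sketch: gradient-free evolution strategy over policy parameters.
import numpy as np

def rollout_return(theta: np.ndarray) -> float:
    """Placeholder for the episode return a physics simulator would give policy theta."""
    return -float(np.sum((theta - 1.0) ** 2))  # toy objective for illustration only

def evolution_strategy(dim=8, pop=32, iters=200, sigma=0.1, lr=0.05, seed=0):
    """Basic population-based ES with rank-normalised returns."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    for _ in range(iters):
        eps = rng.standard_normal((pop, dim))                          # parameter perturbations
        returns = np.array([rollout_return(theta + sigma * e) for e in eps])
        ranks = returns.argsort().argsort() / (pop - 1) - 0.5          # robust to reward scale
        theta += lr / (pop * sigma) * eps.T @ ranks                    # ES gradient estimate
    return theta

if __name__ == "__main__":
    theta_es = evolution_strategy()
    # Stage 2 (omitted): initialise a DRL policy (e.g., a PPO actor) from theta_es
    # and fine-tune it with on-policy gradient updates in simulation.
    print("ES return:", rollout_return(theta_es))
```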
Abstract: Human beings use various reactive strategies, e.g. foot-location adjustment and upper-body inclination, to keep balance while walking under dynamic disturbances. In this work, we propose a novel Nonlinear Model Predictive Control (NMPC) framework for versatile bipedal gait pattern generation, with the capabilities of footstep adjustment, Center of Mass (CoM) height variation, and angular momentum adaptation. These features are realized by constraining the Zero Moment Point (ZMP) motion while accounting for the variable CoM height and angular momentum change of the Inverted Pendulum plus Flywheel Model. In addition, the NMPC framework takes into account constraints on footstep location, CoM vertical motion, upper-body inclination, and joint torques, and is finally formulated as a quadratically constrained quadratic program, which can be solved efficiently by Sequential Quadratic Programming. Using this unified framework, versatile walking patterns that exploit time-varying CoM height trajectories and angular momentum changes can be generated based only on terrain information. Furthermore, the improved capability for balance recovery under external pushes is demonstrated through simulation studies.
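For context, one common planar form of the ZMP constraint being described is sketched below, assuming an Inverted Pendulum plus Flywheel model with varying CoM height and the ground at zero height; the symbols (total mass m, CoM position (c_x, c_z), sagittal angular momentum L_y about the CoM, gravity g) are illustrative and not the paper's notation.

```latex
% Sagittal-plane ZMP of the variable-height inverted pendulum plus flywheel,
% constrained to stay inside the support polygon (illustrative notation).
\[
  p_x \;=\; c_x \;-\; \frac{c_z\,\ddot{c}_x}{\ddot{c}_z + g}
        \;-\; \frac{\dot{L}_y}{m\,(\ddot{c}_z + g)},
  \qquad
  p_x^{\min} \;\le\; p_x \;\le\; p_x^{\max}.
\]
```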