Abstract:This work introduces a model-free reinforcement learning framework that enables various modes of motion (quadruped, tripod, or biped) and diverse tasks for legged robot locomotion. We employ a motion-style reward based on a relaxed logarithmic barrier function as a soft constraint to bias the learning process toward the desired motion style, such as gait, foot clearance, joint position, or body height. The predefined gait cycle is encoded in a flexible manner, facilitating gait adjustments throughout the learning process. Extensive experiments demonstrate that KAIST HOUND, a 45 kg robotic system, can achieve biped, tripod, and quadruped locomotion using the proposed framework. Quadrupedal capabilities include traversing uneven terrain, galloping at 4.67 m/s, and overcoming obstacles up to 58 cm (67 cm for HOUND2); bipedal capabilities include running at 3.6 m/s, carrying a 7.5 kg object, and ascending stairs. All of these are performed without exteroceptive input.
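The relaxed logarithmic barrier mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common form in which the ordinary barrier -ln(z) is replaced by a quadratic extension once z drops below a threshold delta; the abstract does not state which relaxation the authors use. The helper `style_reward`, its bounds, and the default `delta` are hypothetical names and values chosen for illustration.

```python
import math


def relaxed_log_barrier(z: float, delta: float) -> float:
    """Relaxed logarithmic barrier for the constraint z > 0.

    For z > delta this is the ordinary barrier -ln(z). For z <= delta it
    switches to a quadratic extension that matches the value and slope of
    -ln(z) at z = delta, so the penalty stays finite under violation.
    """
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    if z > delta:
        return -math.log(z)
    return 0.5 * (((z - 2.0 * delta) / delta) ** 2 - 1.0) - math.log(delta)


def style_reward(value: float, lower: float, upper: float, delta: float = 0.1) -> float:
    """Hypothetical style reward that prefers `value` inside [lower, upper].

    Both bounds are treated as soft constraints: the reward decreases
    smoothly near a bound and keeps decreasing (without blowing up to
    infinity) once the bound is crossed.
    """
    penalty = relaxed_log_barrier(value - lower, delta)
    penalty += relaxed_log_barrier(upper - value, delta)
    return -penalty
```

Because the penalty remains finite outside the bounds, a policy that starts far from the desired style still receives a usable learning signal, which is what makes the constraint soft.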
Abstract:This paper presents a method for achieving high-speed running of a quadruped robot by considering the actuator torque-speed operating region in reinforcement learning. The physical properties and constraints of the actuator are included in the training process to reduce state transitions that are infeasible in the real world due to motor torque-speed limitations. The gait reward is designed to distribute motor torque evenly across all legs, contributing to more balanced power usage and mitigating performance bottlenecks due to single-motor saturation. Additionally, we designed a lightweight foot to enhance the robot's agility. We observed that applying the motor operating region as a constraint helps the policy network avoid infeasible areas during sampling. With the trained policy, KAIST Hound, a 45 kg quadruped robot, can run up to 6.5 m/s, which is the fastest speed among electric motor-based quadruped robots.
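One way to picture a torque-speed operating region is the sketch below. It assumes a simple linear torque-speed line capped at a peak torque, which is a textbook motor approximation and not necessarily the actuator model used in the paper. The function names and the numbers in the usage note are hypothetical.

```python
def torque_bounds(speed: float, peak_torque: float,
                  stall_torque: float, no_load_speed: float) -> tuple:
    """Return (lower, upper) torque bounds at a given joint speed.

    Assumed model: torque in the direction of motion falls linearly from
    `stall_torque` at zero speed to zero at `no_load_speed`, and every
    bound is additionally capped at `peak_torque`.
    """
    forward = stall_torque * (1.0 - speed / no_load_speed)
    backward = stall_torque * (1.0 + speed / no_load_speed)
    upper = max(0.0, min(peak_torque, forward))
    lower = -max(0.0, min(peak_torque, backward))
    return lower, upper


def clip_to_operating_region(command: float, speed: float, peak_torque: float,
                             stall_torque: float, no_load_speed: float) -> float:
    """Clip a commanded torque to what the assumed motor model can deliver."""
    lower, upper = torque_bounds(speed, peak_torque, stall_torque, no_load_speed)
    return min(max(command, lower), upper)
```

With illustrative values `peak_torque=200`, `stall_torque=300`, and `no_load_speed=30`, a 250 N·m command is clipped to 200 N·m at standstill and to about 60 N·m at 24 rad/s. Applying such a clip inside the simulator is one way to keep training from relying on torques the hardware cannot produce at high joint speeds.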
Abstract:This paper presents a contact-implicit model predictive control (MPC) framework for the real-time discovery of multi-contact motions, without predefined contact mode sequences or foothold positions. This approach utilizes the contact-implicit differential dynamic programming (DDP) framework, merging the hard contact model with a linear complementarity constraint. We propose the analytical gradient of the contact impulse based on relaxed complementarity constraints to further the exploration of a variety of contact modes. By leveraging a hard contact model-based simulation and computation of search direction through a smooth gradient, our methodology identifies dynamically feasible state trajectories, control inputs, and contact forces while simultaneously unveiling new contact mode sequences. However, the broadened scope of contact modes does not always ensure real-world applicability. Recognizing this, we implemented differentiable cost terms to guide foot trajectories and shape gait patterns. Furthermore, to address the challenge of unstable initial roll-outs in an MPC setting, we employ the multiple shooting variant of DDP. The efficacy of the proposed framework is validated on a 45 kg HOUND quadruped robot through various simulated tasks and real-world experiments involving a forward trot and a front-leg rearing motion.
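The idea of simulating with a hard contact model while differentiating a relaxed one can be sketched on a one-dimensional, frictionless toy problem. This is an illustrative assumption and not the paper's formulation: `a` stands for a positive inverse effective mass, `b` for the contact-normal velocity before the impulse, and `rho` for the relaxation parameter. The hard model solves 0 <= lam, a*lam + b >= 0, lam*(a*lam + b) = 0, while the relaxed model replaces the last condition with lam*(a*lam + b) = rho.

```python
import math


def hard_impulse(a: float, b: float) -> float:
    """Impulse of the hard contact model: solves 0 <= lam  _|_  a*lam + b >= 0."""
    return max(0.0, -b / a)


def relaxed_impulse(a: float, b: float, rho: float) -> float:
    """Positive root of lam * (a*lam + b) = rho, the relaxed complementarity."""
    return (-b + math.sqrt(b * b + 4.0 * a * rho)) / (2.0 * a)


def relaxed_impulse_gradient(a: float, b: float, rho: float) -> float:
    """Analytical derivative d(lam)/d(b) of the relaxed impulse."""
    return (-1.0 + b / math.sqrt(b * b + 4.0 * a * rho)) / (2.0 * a)
```

For a separating contact (b > 0) the hard impulse is zero and so is its derivative, so an optimizer gets no hint that making contact could help. The relaxed gradient is small but nonzero there, which is the kind of smooth information that can steer a search direction toward new contact modes.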
Abstract:Control of legged robots is a challenging problem that has been investigated by different approaches, such as model-based control and learning algorithms. This work proposes a novel Imitating and Finetuning Model Predictive Control (IFM) framework to combine the strengths of both approaches. Our framework first develops a conventional model predictive controller (MPC) using Differential Dynamic Programming and the Raibert heuristic, which serves as an expert policy. Then we train a clone of the MPC using imitation learning to make the controller learnable. Finally, we leverage deep reinforcement learning with limited exploration for further finetuning the policy on more challenging terrains. By conducting comprehensive simulation and hardware experiments, we demonstrate that the proposed IFM framework can significantly improve the performance of the given MPC controller on rough, slippery, and conveyor terrains that require careful coordination of footsteps. We also showcase that IFM can efficiently produce more symmetric, periodic, and energy-efficient gaits compared to vanilla RL with a minimal burden of reward shaping.
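The Raibert heuristic referenced in this abstract is commonly written as a foothold placed half a stance period ahead of the hip, plus a velocity-error correction. The sketch below uses that common form in the horizontal plane; the abstract does not give the exact variant or gains used in the expert MPC, so the function name, argument layout, and default `gain` are assumptions.

```python
def raibert_foothold(hip_xy, vel_xy, vel_cmd_xy, stance_time, gain=0.03):
    """Planar foothold target from the commonly used Raibert heuristic.

    hip_xy      -- hip position projected on the ground (m)
    vel_xy      -- current body velocity (m/s)
    vel_cmd_xy  -- commanded body velocity (m/s)
    stance_time -- expected stance duration (s)
    gain        -- feedback gain on the velocity error (s), illustrative value
    """
    return tuple(
        hip + 0.5 * stance_time * vel + gain * (vel - cmd)
        for hip, vel, cmd in zip(hip_xy, vel_xy, vel_cmd_xy)
    )
```

When the body moves faster than commanded, the feedback term pushes the foothold further forward, so the next stance phase decelerates the body.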
Abstract:In this work, a non-gaited framework for legged system locomotion is presented. The approach decouples the gait sequence optimization by considering the problem as a decision-making process. The redefined contact sequence problem is solved by utilizing a Monte Carlo Tree Search (MCTS) algorithm that exploits optimization-based simulations to evaluate the best search direction. The proposed scheme achieves a good trade-off between exploration and exploitation of the search space compared to the state-of-the-art Mixed-Integer Quadratic Programming (MIQP). The model predictive control (MPC) utilizes the gait generated by the MCTS to optimize the ground reaction forces and future foothold positions. The simulation results, obtained on a quadruped robot, showed that the proposed framework could generate known periodic gaits and adapt the contact sequence to the encountered conditions, including external forces and terrain with unknown and variable properties. The system also proved reliable when tested on robots with different layouts.
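The exploration and exploitation trade-off in MCTS is often handled with the UCB1 selection rule. The abstract does not say which rule the authors use, so the sketch below is an assumption, and the node layout (a tuple of per-leg stance flags, a visit count, and an accumulated reward) is a hypothetical encoding of a contact sequence step.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    contact: tuple                      # hypothetical: stance flag per leg, e.g. (1, 0, 0, 1)
    visits: int = 0
    total_reward: float = 0.0
    children: list = field(default_factory=list)


def ucb1(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """UCB1 score: mean reward plus an exploration bonus for rarely tried children."""
    if child.visits == 0:
        return math.inf                 # always try unvisited children first
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(node: Node) -> Node:
    """Pick the child with the highest UCB1 score."""
    return max(node.children, key=lambda ch: ucb1(ch, node.visits))


def backpropagate(path: list, reward: float) -> None:
    """Update statistics along the visited path after one rollout."""
    for node in path:
        node.visits += 1
        node.total_reward += reward
```

In a contact-sequence search, the reward passed to `backpropagate` would come from evaluating the candidate sequence, for example with the optimization-based simulation the abstract describes.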
Abstract:This paper presents a novel Representation-Free Model Predictive Control (RF-MPC) framework for controlling various dynamic motions of a quadrupedal robot in three-dimensional (3D) space. Our formulation directly represents the rotational dynamics using the rotation matrix, which liberates us from the issues associated with the use of Euler angles and quaternions as the orientation representations. With a variation-based linearization scheme and a carefully constructed cost function, the MPC control law is transcribed to the standard Quadratic Program (QP) form. The MPC controller can operate at real-time rates of 250 Hz on a quadruped robot. Experimental results including periodic quadrupedal gaits and a controlled backflip validate that our control strategy can stabilize dynamic motions that involve singularities in 3D maneuvers.
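Working directly with the rotation matrix can be illustrated by propagating an orientation with the matrix exponential, computed here through Rodrigues' formula. This is a generic sketch of integration on SO(3) and not the paper's variation-based linearization; the function names, the constant body-frame angular velocity, and the use of plain nested lists are illustrative assumptions.

```python
import math


def hat(w):
    """Map a 3-vector to its skew-symmetric matrix."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def exp_so3(w):
    """Rodrigues' formula: exponential of hat(w), a rotation by |w| about w."""
    theta = math.sqrt(sum(c * c for c in w))
    if theta < 1e-8:                     # series expansion near zero rotation
        a, b = 1.0 - theta ** 2 / 6.0, 0.5 - theta ** 2 / 24.0
    else:
        a, b = math.sin(theta) / theta, (1.0 - math.cos(theta)) / theta ** 2
    k = hat(w)
    k2 = matmul(k, k)
    return [[(1.0 if i == j else 0.0) + a * k[i][j] + b * k2[i][j]
             for j in range(3)] for i in range(3)]


def integrate_rotation(r, omega_body, dt):
    """One step R_next = R * exp(hat(omega_body * dt)) for body-frame omega."""
    return matmul(r, exp_so3([c * dt for c in omega_body]))


def orthogonality_error(r):
    """Largest entry of |R^T R - I|; zero for an exact rotation matrix."""
    rt = [[r[j][i] for j in range(3)] for i in range(3)]
    p = matmul(rt, r)
    return max(abs(p[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))
```

Because each update multiplies by an exact rotation, the orientation stays on SO(3) up to rounding error, and no Euler-angle singularity or quaternion sign ambiguity appears at any attitude.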
Abstract:This paper proposes a kinodynamic motion planning framework for multi-legged robot jumping based on the mixed-integer convex program (MICP), which simultaneously reasons about centroidal motion, contact points, wrench, and gait sequences. This method uniquely combines configuration space discretization and the construction of the feasible wrench polytope (FWP) to encode kinematic constraints, actuator limits, friction cone constraints, and gait sequencing into a single MICP. The MICP can be efficiently solved to the global optimum by off-the-shelf numerical solvers and provides highly dynamic jumping motions without requiring initial guesses. Simulation and experimental results demonstrate that the proposed method can find novel and dexterous maneuvers that are directly deployable on a two-legged robot platform to traverse challenging terrains.