Shunpeng Yang

Multi-Loco: Unifying Multi-Embodiment Legged Locomotion via Reinforcement Learning Augmented Diffusion

Jun 13, 2025

Shunpeng Yang, Zhen Fu, Zhefeng Cao, Guo Junde, Patrick Wensing, Wei Zhang, Hua Chen

Abstract: Generalizing locomotion policies across diverse legged robots with varying morphologies is a key challenge due to differences in observation/action dimensions and system dynamics. In this work, we propose Multi-Loco, a novel unified framework combining a morphology-agnostic generative diffusion model with a lightweight residual policy optimized via reinforcement learning (RL). The diffusion model captures morphology-invariant locomotion patterns from diverse cross-embodiment datasets, improving generalization and robustness. The residual policy is shared across all embodiments and refines the actions generated by the diffusion model, enhancing task-aware performance and robustness for real-world deployment. We evaluated our method with a rich library of four legged robots in both simulation and real-world experiments. Compared to a standard RL framework with PPO, our approach, which replaces the Gaussian policy with a diffusion model and a residual term, achieves a 10.35% average return improvement, with gains up to 13.57% in wheeled-biped locomotion tasks. These results highlight the benefits of cross-embodiment data and composite generative architectures in learning robust, generalized locomotion skills.

* 19 pages
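The abstract describes the policy as a diffusion-model action refined by a shared residual term. A minimal Python sketch of that composition follows. It assumes, which the abstract does not state, that each robot's action is zero-padded to a shared dimension and that the residual is scaled by a small constant. The names `MAX_ACT_DIM`, `pad_to_shared`, `compose_action`, and `residual_scale` are hypothetical, and fixed arrays stand in for the diffusion sampler and the RL-trained residual network.

```python
import numpy as np

MAX_ACT_DIM = 12  # assumed size of the shared (padded) action space


def pad_to_shared(action, max_dim=MAX_ACT_DIM):
    """Zero-pad an embodiment-specific action and return it with a validity mask."""
    action = np.asarray(action, dtype=float)
    padded = np.zeros(max_dim)
    padded[: action.size] = action
    mask = np.zeros(max_dim, dtype=bool)
    mask[: action.size] = True
    return padded, mask


def compose_action(base_action, residual, mask, residual_scale=0.1):
    """Add a scaled residual to the base action; unused dimensions stay at zero."""
    combined = base_action + residual_scale * residual
    return np.where(mask, combined, 0.0)


# Stand-in for a diffusion-model sample on a robot with 6 actuated joints.
base, mask = pad_to_shared([0.2, -0.1, 0.4, 0.0, 0.3, -0.2])
# Stand-in for the output of the shared residual policy.
residual = np.ones(MAX_ACT_DIM)
action = compose_action(base, residual, mask)
```

Only the first six entries of `action` are sent to the robot in this sketch; the mask keeps the padded dimensions from receiving any residual correction.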

Task-Space Riccati Feedback based Whole Body Control for Underactuated Legged Locomotion

Mar 31, 2024

Shunpeng Yang, Zejun Hong, Sen Li, Patrick Wensing, Wei Zhang, Hua Chen

Abstract: This manuscript primarily aims to enhance the performance of whole-body controllers (WBC) for underactuated legged locomotion. We introduce a systematic parameter design mechanism for the floating-base feedback control within the WBC. The proposed approach uses the linearized model of the unactuated dynamics to formulate a Linear Quadratic Regulator (LQR) and solves for a Riccati gain while accounting for potential physical constraints through a second-order approximation of the log-barrier function. The user-tuned feedback gain for the floating-base task is then replaced by a new one constructed from the solved Riccati gain. Extensive simulations conducted in MuJoCo with a point bipedal robot, as well as real-world experiments performed on a quadruped robot, demonstrate the effectiveness of the proposed method. Across the bipedal locomotion tasks, compared with the user-tuned method, the proposed approach is at least 12% and up to 50% better at linear velocity tracking, and at least 7% and up to 47% better at angular velocity tracking. In the quadruped experiment, the proposed method improves linear velocity tracking by at least 3% and angular velocity tracking by at least 23%.

* 6 pages, submitted to IROS 2024
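The core step named in the abstract, solving a Riccati equation to obtain an LQR feedback gain from a linearized model, can be sketched in Python as below. This is a generic discrete-time LQR computed by fixed-point iteration on a double integrator, not the paper's formulation: the system matrices, weights, and the function name `dlqr_gain` are illustrative assumptions, and the log-barrier constraint handling is omitted.

```python
import numpy as np


def dlqr_gain(A, B, Q, R, iters=5000, tol=1e-10):
    """Iterate the discrete-time Riccati recursion until P stops changing.

    Returns the feedback gain K (for u = -K x) and the cost-to-go matrix P.
    """
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P


# Illustrative double integrator (position, velocity) sampled at 100 Hz.
dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.diag([100.0, 1.0])  # assumed state weights
R = np.array([[0.01]])     # assumed input weight

K, P = dlqr_gain(A, B, Q, R)
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

In this sketch `K` plays the role that a hand-tuned proportional-derivative gain would otherwise play; all eigenvalues of `A - B @ K` lie inside the unit circle, so the closed loop is stable.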

Template Model Inspired Task Space Learning for Robust Bipedal Locomotion

Sep 27, 2023

Guillermo A. Castillo, Bowen Weng, Shunpeng Yang, Wei Zhang, Ayonga Hereid

Abstract: This work presents a hierarchical framework for bipedal locomotion that combines a Reinforcement Learning (RL)-based high-level (HL) planner policy for the online generation of task space commands with a model-based low-level (LL) controller to track the desired task space trajectories. Unlike traditional end-to-end learning approaches, our HL policy takes insights from the angular momentum-based linear inverted pendulum (ALIP) to carefully design the observation and action spaces of the Markov Decision Process (MDP). This simple yet effective design creates an insightful mapping between a low-dimensional state that effectively captures the complex dynamics of bipedal locomotion and a set of task space outputs that shape the walking gait of the robot. The HL policy is agnostic to the task space LL controller, which increases the flexibility of the design and the generalization of the framework to other bipedal robots. This hierarchical design results in a learning-based framework with improved performance, data efficiency, and robustness compared with the ALIP model-based approach and state-of-the-art learning-based frameworks for bipedal locomotion. The proposed hierarchical controller is tested on three different robots: Rabbit, a five-link underactuated planar biped; Walker2D, a seven-link fully-actuated planar biped; and Digit, a 3D humanoid robot with 20 actuated joints. The trained policy naturally learns human-like locomotion behaviors and is able to effectively track a wide range of walking speeds while preserving the robustness and stability of the walking gait even under adversarial conditions.

* Accepted at 2023 International Conference on Intelligent Robots and Systems (IROS). Supplemental Video: https://youtu.be/YTjMgGka4Ig
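For readers unfamiliar with the ALIP template mentioned in the abstract, the sketch below evaluates the closed-form solution of the planar ALIP dynamics as commonly written in the literature: dx/dt = L / (m H) and dL/dt = m g x, where x is the center-of-mass position relative to the stance contact and L is the angular momentum about that contact. The sign convention, the parameter values, and the function name `alip_state` are assumptions for illustration; the paper's observation and action design is not reproduced here.

```python
import math


def alip_state(x0, L0, t, mass=30.0, height=0.8, g=9.81):
    """Closed-form planar ALIP state (x, L) at time t from initial state (x0, L0)."""
    ell = math.sqrt(g / height)   # natural frequency of the pendulum
    k = mass * height * ell       # converts position into momentum units
    c, s = math.cosh(ell * t), math.sinh(ell * t)
    x = x0 * c + (L0 / k) * s
    L = k * x0 * s + L0 * c
    return x, L


# Start with the center of mass 10 cm behind the contact, moving forward.
x_end, L_end = alip_state(x0=-0.1, L0=12.0, t=0.3)
```

The two-dimensional state `(x, L)` is the kind of low-dimensional summary the abstract refers to; the quantity `L**2 - (mass * height * ell * x)**2` is constant along any trajectory of this model, which gives a quick consistency check.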

Quadruped Capturability and Push Recovery via a Switched-Systems Characterization of Dynamic Balance

Feb 17, 2022

Hua Chen, Zejun Hong, Shunpeng Yang, Patrick M. Wensing, Wei Zhang

Abstract: This paper studies capturability and push recovery for quadrupedal locomotion. Despite the rich literature on capturability analysis and push recovery control for legged robots, existing tools are developed mainly for bipeds or humanoids. Distinct quadrupedal features such as point contacts and multiple swinging legs prevent direct application of these methods. To address this gap, we propose a switched systems model for quadruped dynamics, and instantiate the abstract viability concept for quadrupedal locomotion with a time-based gait. Capturability is characterized through a novel specification of dynamically balanced states that addresses the time-varying nature of quadrupedal locomotion and balance. A linear inverted pendulum (LIP) model is adopted to demonstrate the theory and show how the newly developed quadrupedal capturability can be used in motion planning for quadrupedal push recovery. We formulate and solve an explicit model predictive control (EMPC) problem whose optimal solution fully characterizes quadrupedal capturability with the LIP. Given this analysis, an optimization-based planning scheme is devised for determining footsteps and center of mass references during push recovery. To validate the effectiveness of the overall framework, we conduct numerous simulation and hardware experiments. Simulation results illustrate the necessity of considering dynamic balance for quadrupedal capturability, and the significant improvement in disturbance rejection with the proposed strategy. Experimental validations on a replica of the Mini Cheetah quadruped demonstrate an improvement of up to 100% compared with the state of the art.
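As background for the LIP model used in the abstract, the sketch below computes the classical instantaneous capture point from the biped literature, xi = x + v / omega with omega = sqrt(g / z), and checks that a LIP pivoting about that point comes to rest over it. This is the standard single-contact notion, not the paper's quadrupedal, time-varying characterization; the function names and numerical values are illustrative assumptions.

```python
import math


def capture_point(x, v, com_height, g=9.81):
    """Instantaneous capture point of a linear inverted pendulum."""
    omega = math.sqrt(g / com_height)
    return x + v / omega


def lip_position(x0, v0, pivot, t, com_height, g=9.81):
    """Closed-form LIP center-of-mass position at time t for a fixed pivot."""
    omega = math.sqrt(g / com_height)
    a = 0.5 * ((x0 - pivot) + v0 / omega)  # coefficient of the diverging mode
    b = 0.5 * ((x0 - pivot) - v0 / omega)  # coefficient of the decaying mode
    return pivot + a * math.exp(omega * t) + b * math.exp(-omega * t)


# Center of mass at the origin, pushed to 0.5 m/s, at an assumed height of 0.3 m.
xi = capture_point(x=0.0, v=0.5, com_height=0.3)
x_late = lip_position(0.0, 0.5, xi, t=3.0, com_height=0.3)
```

Placing the pivot at `xi` cancels the diverging mode, so `x_late` settles at `xi`; any other pivot leaves a nonzero diverging component.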

Force-feedback based Whole-body Stabilizer for Position-Controlled Humanoid Robots

Aug 15, 2021

Shunpeng Yang, Hua Chen, Zhen Fu, Wei Zhang

Abstract: This paper studies stabilizer design for position-controlled humanoid robots. Stabilizers are an essential part of position-controlled humanoids; their primary objective is to adjust the control input sent to the robot so that the tracking controller better follows the planned reference trajectory. To achieve this goal, this paper develops a novel force-feedback-based whole-body stabilizer that fully exploits the six-dimensional force measurement information and the whole-body dynamics to improve tracking performance. Relying on a rigorous analysis of the whole-body dynamics of position-controlled humanoids under unknown contact, the developed stabilizer leverages a quadratic-programming-based technique that allows cooperative consideration of both center-of-mass tracking and contact force tracking. The effectiveness of the proposed stabilizer is demonstrated on the UBTECH Walker robot in the MuJoCo simulator. Simulation validations show a significant improvement in various scenarios compared with commonly adopted stabilizers based on zero-moment-point feedback and the linear inverted pendulum model.

* IROS 2021, 8 pages
