Abstract: Humanoid robots with behavioral autonomy have consistently been regarded as ideal collaborators in our daily lives and promising representations of embodied intelligence. Compared to fixed-base robotic arms, humanoid robots offer a larger operational space while significantly increasing the difficulty of control and planning. Despite rapid progress towards general-purpose humanoid robots, most studies remain focused on locomotion, with few investigations into whole-body coordination and task planning, thus limiting the potential to demonstrate long-horizon tasks involving both mobility and manipulation under open-ended verbal instructions. In this work, we propose a novel framework that learns, selects, and plans behaviors based on tasks in different scenarios. We combine reinforcement learning (RL) with whole-body optimization to generate robot motions and store them in a motion library. We further leverage the planning and reasoning capabilities of a large language model (LLM) to construct a hierarchical task graph that comprises a series of motion primitives, bridging lower-level execution with higher-level planning. Experiments in simulation and the real world using the CENTAURO robot show that the language-model-based planner can efficiently adapt to new loco-manipulation tasks, demonstrating high autonomy from free-text commands in unstructured scenes.
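For illustration only, the sketch below shows one plausible way to represent a motion library of learned primitives and a hierarchical task graph in Python; the class names, primitive labels, and execution logic are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class MotionPrimitive:
    """A learned skill, e.g. from RL or whole-body optimization (hypothetical structure)."""
    name: str
    execute: Callable[[], bool]                       # returns True on success
    preconditions: List[str] = field(default_factory=list)

@dataclass
class TaskGraph:
    """Hierarchical task graph: an ordered list of primitive names per subtask."""
    subtasks: Dict[str, List[str]]

def run_task(graph: TaskGraph, library: Dict[str, MotionPrimitive]) -> bool:
    """Execute the plan by dispatching each primitive from the motion library."""
    for subtask, primitive_names in graph.subtasks.items():
        for name in primitive_names:
            if not library[name].execute():
                print(f"Primitive '{name}' failed during subtask '{subtask}'")
                return False
    return True

# Toy usage: the lambdas stand in for learned low-level controllers.
library = {
    "walk_to_table": MotionPrimitive("walk_to_table", lambda: True),
    "grasp_box":     MotionPrimitive("grasp_box", lambda: True),
}
graph = TaskGraph(subtasks={"fetch_box": ["walk_to_table", "grasp_box"]})
print(run_task(graph, library))
```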
Abstract: Enabling humanoid robots to autonomously perform loco-manipulation in unstructured environments is crucial and highly challenging for achieving embodied intelligence. It requires robots to plan their actions and behaviors over long-horizon tasks while using multi-modal perception to detect deviations between task execution and the high-level plan. Recently, large language models (LLMs) have demonstrated powerful planning and reasoning capabilities for comprehending and processing semantic information in robot control tasks, as well as the ability to make analytical judgments and decisions over multi-modal inputs. To leverage the power of LLMs for humanoid loco-manipulation, we propose a novel language-model-based framework that enables robots to autonomously plan behaviors and low-level execution from given textual instructions, while observing and correcting failures that may occur during task execution. To systematically evaluate this framework in grounding LLMs, we created a robot 'action' and 'sensing' behavior library for task planning, conducted mobile manipulation experiments in both simulated and real environments using the CENTAURO robot, and verified the effectiveness and applicability of this approach for robotic tasks with autonomous behavioral planning.
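As a purely illustrative sketch of the execute-observe-correct pattern this abstract describes, the snippet below pairs 'action' behaviors with 'sensing' checks and replans on failure. The plan_with_llm stub stands in for an actual LLM call, and all behavior names and the replanning logic are assumptions, not the paper's implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

# 'action' behaviors return True if the motion completed;
# 'sensing' behaviors return True if the expected post-condition is observed.
ACTIONS: Dict[str, Callable[[], bool]] = {
    "approach_object": lambda: True,
    "grasp_object":    lambda: True,
}
SENSING: Dict[str, Callable[[], bool]] = {
    "near_object":       lambda: True,
    "object_in_gripper": lambda: True,
}

def plan_with_llm(instruction: str, failure: Optional[str] = None) -> List[Tuple[str, str]]:
    """Placeholder for an LLM planner returning (action, sensing_check) pairs.

    A real system would prompt the LLM with the instruction, the behavior
    library, and any failure feedback; here we simply return a fixed plan.
    """
    return [("approach_object", "near_object"),
            ("grasp_object", "object_in_gripper")]

def execute(instruction: str, max_replans: int = 2) -> bool:
    plan = plan_with_llm(instruction)
    for _ in range(max_replans + 1):
        for action, check in plan:
            if not (ACTIONS[action]() and SENSING[check]()):
                # Feed the observed failure back to the planner and retry.
                plan = plan_with_llm(instruction, failure=f"{action} failed check {check}")
                break
        else:
            return True
    return False

print(execute("pick up the box"))
```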
Abstract: Enabling robots to autonomously perform hybrid motions in diverse environments can be beneficial for long-horizon tasks such as material handling, household chores, and work assistance. This requires extensive exploitation of intrinsic motion capabilities, extraction of affordances from rich environmental information, and planning of physical interaction behaviors. Although recent progress has demonstrated impressive humanoid whole-body control abilities, existing approaches struggle to achieve versatility and adaptability for new tasks. In this work, we propose HYPERmotion, a framework that learns, selects, and plans behaviors based on tasks in different scenarios. We combine reinforcement learning with whole-body optimization to generate motions for 38 actuated joints and create a motion library to store the learned skills. We apply the planning and reasoning features of large language models (LLMs) to complex loco-manipulation tasks, constructing a hierarchical task graph that comprises a series of primitive behaviors to bridge lower-level execution with higher-level planning. We further leverage the interaction of distilled spatial geometry and 2D observations with a vision-language model (VLM) to ground knowledge into a robot morphology selector that chooses appropriate actions for single- or dual-arm manipulation and legged or wheeled locomotion. Experiments in simulation and the real world show that learned motions can efficiently adapt to new tasks, demonstrating high autonomy from free-text commands in unstructured scenes. Videos and website: hy-motion.github.io/
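To make the morphology-selection idea concrete, here is a minimal sketch of how assumed VLM-derived scene attributes could be mapped to a choice of arm configuration and locomotion mode. The attribute names, thresholds, and decision rules are invented for illustration and are not the authors' selector.

```python
from dataclasses import dataclass

@dataclass
class SceneAttributes:
    """Hypothetical attributes distilled from VLM output and spatial geometry."""
    object_width_m: float    # graspable width of the target object
    object_mass_kg: float    # estimated mass
    terrain_rough: bool      # True if the terrain is uneven

def select_morphology(attrs: SceneAttributes) -> dict:
    """Choose single- vs dual-arm manipulation and legged vs wheeled locomotion."""
    arms = "dual_arm" if (attrs.object_width_m > 0.4 or attrs.object_mass_kg > 5.0) else "single_arm"
    locomotion = "legged" if attrs.terrain_rough else "wheeled"
    return {"arms": arms, "locomotion": locomotion}

# Example: a wide, heavy box on flat ground -> dual-arm grasp, wheeled driving.
print(select_morphology(SceneAttributes(0.6, 8.0, terrain_rough=False)))
```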
Abstract: Recent progress in legged locomotion has rendered quadruped manipulators a promising solution for performing tasks that require both mobility and manipulation (loco-manipulation). In the real world, task specifications and/or environment constraints may require the quadruped manipulator to be equipped with high redundancy as well as whole-body motion coordination capabilities. This work presents an experimental evaluation of a whole-body Model Predictive Control (MPC) framework achieving real-time performance on a dual-arm quadruped platform consisting of 37 actuated joints. To the best of our knowledge, this is the legged manipulator with the highest number of joints to be controlled with real-time whole-body MPC so far. The computational efficiency of the MPC, while considering the full robot kinematics and the centroidal dynamics model, builds upon an open-source DDP-variant solver and a state-of-the-art optimal control problem formulation. Differently from previous works on quadruped manipulators, the MPC is directly interfaced with the low-level joint impedance controllers without the need to design an instantaneous whole-body controller. Feasibility on the real hardware is showcased using the CENTAURO platform for the challenging task of picking a heavy object from the ground. Dynamic stepping (trotting) is also showcased for the first time with this robot. The results highlight the potential of replanning with whole-body information in a predictive control loop.
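For context, a generic optimal control problem of the kind this class of whole-body MPC solves (a sketch of the standard centroidal-plus-kinematics formulation, not necessarily the exact problem posed in the paper) can be written as

\[
\min_{u(\cdot)} \int_{t}^{t+T} \big\| x(\tau) - x_{\mathrm{ref}}(\tau) \big\|_{Q}^{2} + \big\| u(\tau) \big\|_{R}^{2}\, d\tau
\quad \text{s.t.} \quad
\dot{h} = \sum_{i} \begin{bmatrix} f_i \\ (p_i - c)\times f_i \end{bmatrix} + \begin{bmatrix} m\,g \\ 0 \end{bmatrix},
\qquad \dot{q} = v,
\]

where the state \(x = (h, q)\) stacks the centroidal momentum \(h\) and the base/joint configuration \(q\), the input \(u = (f_i, v)\) collects the contact forces and the velocities, \(p_i\) are the contact points and \(c\) the CoM; a DDP-variant solver then handles the resulting nonlinear problem in a receding-horizon fashion.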
Abstract: Re-grasp manipulation leverages ergonomic tools to assist humans in accomplishing diverse tasks. In certain scenarios, humans often employ external forces to effortlessly and precisely re-grasp tools like a hammer. Previous controllers for in-grasp sliding motion using passive dynamic actions (e.g., gravity) rely on knowledge of finger-object contact information and require customized designs for individual objects with varied geometry and weight distribution, which limits their adaptability to diverse objects. In this paper, we propose an end-to-end sliding motion controller based on imitation learning (IL) that requires minimal prior knowledge of object mechanics, relying solely on object position information. To expedite training convergence, we use a data glove to collect expert demonstration trajectories and train the policy through Generative Adversarial Imitation Learning (GAIL). Simulation results demonstrate the controller's versatility in performing in-hand sliding tasks with objects of varying friction coefficients, geometric shapes, and masses. After migration to a physical system using visual position estimation, the controller achieved an average success rate of 86%, surpassing the baseline success rates of 35% for Behavior Cloning (BC) and 20% for Proximal Policy Optimization (PPO).
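To clarify the GAIL mechanism mentioned here, the sketch below shows a minimal discriminator update and the resulting surrogate reward; it uses dummy data in place of the data-glove demonstrations and omits the PPO policy update, so it illustrates the general technique rather than the paper's actual training pipeline.

```python
import torch
import torch.nn as nn

# Minimal GAIL-style discriminator: it separates expert (state, action) pairs
# from policy rollouts, and its output is turned into a surrogate reward
# for a policy-gradient learner such as PPO.
state_dim, action_dim = 8, 2

disc = nn.Sequential(
    nn.Linear(state_dim + action_dim, 64), nn.Tanh(),
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
bce = nn.BCEWithLogitsLoss()

# Dummy batches standing in for expert demos and current-policy rollouts.
expert_sa = torch.randn(128, state_dim + action_dim)
policy_sa = torch.randn(128, state_dim + action_dim)

for _ in range(100):
    logits_expert = disc(expert_sa)
    logits_policy = disc(policy_sa)
    # Expert pairs labeled 1, policy pairs labeled 0.
    loss = bce(logits_expert, torch.ones_like(logits_expert)) + \
           bce(logits_policy, torch.zeros_like(logits_policy))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Surrogate reward for the policy update: high when the discriminator is fooled.
with torch.no_grad():
    reward = -torch.log(1.0 - torch.sigmoid(disc(policy_sa)) + 1e-8)
print(reward.mean().item())
```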
Abstract: This paper presents a simplified model-based trajectory optimization (TO) formulation for motion planning on quadruped mobile manipulators that carry a heavy payload of known mass. The proposed payload-aware formulation simultaneously plans locomotion and payload manipulation, and considers both robot and payload model dynamics while remaining computationally efficient. In the presence of a heavy payload, the approach exhibits reduced leg outstretching (and thus increased manipulability) in kinematically demanding motions, owing to the contribution of payload manipulation in the optimization. The framework's computational efficiency and performance are validated through a number of simulation and experimental studies with the bi-manual quadruped CENTAURO robot carrying on its arms a payload that exceeds 15% of its mass while traversing non-flat terrain.
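As a worked illustration of how a known payload mass can be folded into such a simplified model (this is a generic combined-CoM construction, not necessarily the paper's exact formulation), one can plan with the total mass and the combined center of mass:

\[
m_{\mathrm{tot}} = m_r + m_p, \qquad
c = \frac{m_r\, c_r + m_p\, c_p}{m_{\mathrm{tot}}}, \qquad
m_{\mathrm{tot}}\,(\ddot{c} - g) = \sum_i f_i ,
\]

where \(m_r, c_r\) are the robot mass and CoM, \(m_p, c_p\) those of the payload held by the arms, and \(f_i\) the ground contact forces; optimizing over \(c_p\) (i.e. where the arms hold the payload) then directly affects balance and leg kinematics.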
Abstract: Human beings can make use of various reactive strategies, e.g. foot location adjustment and upper-body inclination, to keep balance while walking under dynamic disturbances. In this work, we propose a novel Nonlinear Model Predictive Control (NMPC) framework for versatile bipedal gait pattern generation, with the capabilities of footstep adjustment, Center of Mass (CoM) height variation, and angular momentum adaptation. These features are realized by constraining the Zero Moment Point motion while considering the variable CoM height and angular momentum change of the Inverted Pendulum plus Flywheel model. In addition, the NMPC framework takes into account constraints on footstep location, CoM vertical motion, upper-body inclination, and joint torques, and is finally formulated as a quadratically constrained quadratic program (QCQP), which can be solved efficiently by Sequential Quadratic Programming (SQP). Using this unified framework, versatile walking patterns that exploit time-varying CoM height trajectories and angular momentum changes can be generated based only on the terrain information input. Furthermore, the improved capability for balance recovery under external pushes has been demonstrated through simulation studies.
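For reference, a standard sagittal-plane ZMP expression for an inverted pendulum plus flywheel with variable CoM height (shown here as background, not as the paper's exact constraint) is

\[
p_x \;=\; x_c \;-\; \frac{(z_c - p_z)\,\ddot{x}_c}{\ddot{z}_c + g} \;-\; \frac{\dot{L}_y}{m\,(\ddot{z}_c + g)} ,
\]

where \((x_c, z_c)\) is the CoM position, \(p_z\) the ground height, \(\dot{L}_y\) the rate of centroidal angular momentum produced by the flywheel (upper body), and \(g\) gravity; requiring \(p_x\) to stay within the support polygon while \(\ddot{x}_c, \ddot{z}_c, \dot{L}_y\) are decision variables yields the quadratic constraints that give the QCQP structure mentioned above.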