Abstract:Reinforcement-learned locomotion enables legged robots to perform highly dynamic motions but often accompanies time-consuming manual tuning of joint stiffness. This paper introduces a novel control paradigm that integrates variable stiffness into the action space alongside joint positions, enabling grouped stiffness control such as per-joint stiffness (PJS), per-leg stiffness (PLS) and hybrid joint-leg stiffness (HJLS). We show that variable stiffness policies, with grouping in per-leg stiffness (PLS), outperform position-based control in velocity tracking and push recovery. In contrast, HJLS excels in energy efficiency. Furthermore, our method showcases robust walking behaviour on diverse outdoor terrains by sim-to-real transfer, although the policy is sorely trained on a flat floor. Our approach simplifies design by eliminating per-joint stiffness tuning while keeping competitive results with various metrics.
Abstract:Robotic manipulation remains a core challenge in robotics, particularly for contact-rich tasks such as industrial assembly and disassembly. Existing datasets have significantly advanced learning in manipulation but are primarily focused on simpler tasks like object rearrangement, falling short of capturing the complexity and physical dynamics involved in assembly and disassembly. To bridge this gap, we present REASSEMBLE (Robotic assEmbly disASSEMBLy datasEt), a new dataset designed specifically for contact-rich manipulation tasks. Built around the NIST Assembly Task Board 1 benchmark, REASSEMBLE includes four actions (pick, insert, remove, and place) involving 17 objects. The dataset contains 4,551 demonstrations, of which 4,035 were successful, spanning a total of 781 minutes. Our dataset features multi-modal sensor data including event cameras, force-torque sensors, microphones, and multi-view RGB cameras. This diverse dataset supports research in areas such as learning contact-rich manipulation, task condition identification, action segmentation, and more. We believe REASSEMBLE will be a valuable resource for advancing robotic manipulation in complex, real-world scenarios. The dataset is publicly available on our project website: https://dsliwowski1.github.io/REASSEMBLE_page.
Abstract:The introduction of robots into everyday scenarios necessitates algorithms capable of monitoring the execution of tasks. In this paper, we propose ConditionNET, an approach for learning the preconditions and effects of actions in a fully data-driven manner. We develop an efficient vision-language model and introduce additional optimization objectives during training to optimize for consistent feature representations. ConditionNET explicitly models the dependencies between actions, preconditions, and effects, leading to improved performance. We evaluate our model on two robotic datasets, one of which we collected for this paper, containing 406 successful and 138 failed teleoperated demonstrations of a Franka Emika Panda robot performing tasks like pouring and cleaning the counter. We show in our experiments that ConditionNET outperforms all baselines on both anomaly detection and phase prediction tasks. Furthermore, we implement an action monitoring system on a real robot to demonstrate the practical applicability of the learned preconditions and effects. Our results highlight the potential of ConditionNET for enhancing the reliability and adaptability of robots in real-world environments. The data is available on the project website: https://dsliwowski1.github.io/ConditionNET_page.
Abstract:This paper introduces a new approach to enhance the robustness of humanoid walking under strong perturbations, such as substantial pushes. Effective recovery from external disturbances requires bipedal robots to dynamically adjust their stepping strategies, including footstep positions and timing. Unlike most advanced walking controllers that restrict footstep locations to a predefined convex region, substantially limiting recoverable disturbances, our method leverages reinforcement learning to dynamically adjust the permissible footstep region, expanding it to a larger, effectively non-convex area and allowing cross-over stepping, which is crucial for counteracting large lateral pushes. Additionally, our method adapts footstep timing in real time to further extend the range of recoverable disturbances. Based on these adjustments, feasible footstep positions and DCM trajectory are planned by solving a QP. Finally, we employ a DCM controller and an inverse dynamics whole-body control framework to ensure the robot effectively follows the trajectory.
Abstract:As humanoid robots transition from labs to real-world environments, it is essential to democratize robot control for non-expert users. Recent human-robot imitation algorithms focus on following a reference human motion with high precision, but they are susceptible to the quality of the reference motion and require the human operator to simplify its movements to match the robot's capabilities. Instead, we consider that the robot should understand and adapt the reference motion to its own abilities, facilitating the operator's task. For that, we introduce a deep-learning model that anticipates the robot's performance when imitating a given reference. Then, our system can generate multiple references given a high-level task command, assign a score to each of them, and select the best reference to achieve the desired robot behavior. Our Self-AWare model (SAW) ranks potential robot behaviors based on various criteria, such as fall likelihood, adherence to the reference motion, and smoothness. We integrate advanced motion generation, robot control, and SAW in one unique system, ensuring optimal robot behavior for any task command. For instance, SAW can anticipate falls with 99.29% accuracy. For more information check our project page: https://evm7.github.io/Self-AWare
Abstract:This paper addresses the critical need for refining robot motions that, despite achieving a high visual similarity through human-to-humanoid retargeting methods, fall short of practical execution in the physical realm. Existing techniques in the graphics community often prioritize visual fidelity over physics-based feasibility, posing a significant challenge for deploying bipedal systems in practical applications. Our research introduces a constrained reinforcement learning algorithm to produce physics-based high-quality motion imitation onto legged humanoid robots that enhance motion resemblance while successfully following the reference human trajectory. We name our framework: I-CTRL. By reformulating the motion imitation problem as a constrained refinement over non-physics-based retargeted motions, our framework excels in motion imitation with simple and unique rewards that generalize across four robots. Moreover, our framework can follow large-scale motion datasets with a unique RL agent. The proposed approach signifies a crucial step forward in advancing the control of bipedal robots, emphasizing the importance of aligning visual and physical realism for successful motion imitation.
Abstract:This article introduces a framework for complex human-robot collaboration tasks, such as the co-manufacturing of furniture. For these tasks, it is essential to encode tasks from human demonstration and reproduce these skills in a compliant and safe manner. Therefore, two key components are addressed in this work: motion generation and shared autonomy. We propose a motion generator based on a time-invariant potential field, capable of encoding wrench profiles, complex and closed-loop trajectories, and additionally incorporates obstacle avoidance. Additionally, the paper addresses shared autonomy (SA) which enables synergetic collaboration between human operators and robots by dynamically allocating authority. Variable impedance control (VIC) and force control are employed, where impedance and wrench are adapted based on the human-robot autonomy factor derived from interaction forces. System passivity is ensured by an energy-tank based task passivation strategy. The framework's efficacy is validated through simulations and an experimental study employing a Franka Emika Research 3 robot.
Abstract:Integrating robots into populated environments is a complex challenge that requires an understanding of human social dynamics. In this work, we propose to model social motion forecasting in a shared human-robot representation space, which facilitates us to synthesize robot motions that interact with humans in social scenarios despite not observing any robot in the motion training. We develop a transformer-based architecture called ECHO, which operates in the aforementioned shared space to predict the future motions of the agents encountered in social scenarios. Contrary to prior works, we reformulate the social motion problem as the refinement of the predicted individual motions based on the surrounding agents, which facilitates the training while allowing for single-motion forecasting when only one human is in the scene. We evaluate our model in multi-person and human-robot motion forecasting tasks and obtain state-of-the-art performance by a large margin while being efficient and performing in real-time. Additionally, our qualitative results showcase the effectiveness of our approach in generating human-robot interaction behaviors that can be controlled via text commands.
Abstract:Robots are becoming increasingly integrated into our lives, assisting us in various tasks. To ensure effective collaboration between humans and robots, it is essential that they understand our intentions and anticipate our actions. In this paper, we propose a Human-Object Interaction (HOI) anticipation framework for collaborative robots. We propose an efficient and robust transformer-based model to detect and anticipate HOIs from videos. This enhanced anticipation empowers robots to proactively assist humans, resulting in more efficient and intuitive collaborations. Our model outperforms state-of-the-art results in HOI detection and anticipation in VidHOI dataset with an increase of 1.76% and 1.04% in mAP respectively while being 15.4 times faster. We showcase the effectiveness of our approach through experimental results in a real robot, demonstrating that the robot's ability to anticipate HOIs is key for better Human-Robot Interaction. More information can be found on our project webpage: https://evm7.github.io/HOI4ABOT_page/
Abstract:Recently, several approaches have attempted to combine motion generation and control in one loop to equip robots with reactive behaviors, that cannot be achieved with traditional time-indexed tracking controllers. These approaches however mainly focused on positions, neglecting the orientation part which can be crucial to many tasks e.g. screwing. In this work, we propose a control algorithm that adapts the robot's rotational motion and impedance in a closed-loop manner. Given a first-order Dynamical System representing an orientation motion plan and a desired rotational stiffness profile, our approach enables the robot to follow the reference motion with an interactive behavior specified by the desired stiffness, while always being aware of the current orientation, represented as a Unit Quaternion (UQ). We rely on the Lie algebra to formulate our algorithm, since unlike positions, UQ feature constraints that should be respected in the devised controller. We validate our proposed approach in multiple robot experiments, showcasing the ability of our controller to follow complex orientation profiles, react safely to perturbations, and fulfill physical interaction tasks.