Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Stelian Coros

RAMBO: RL-augmented Model-based Optimal Control for Whole-body Loco-manipulation

Apr 09, 2025

Jin Cheng, Dongho Kang, Gabriele Fadini, Guanya Shi, Stelian Coros

Abstract:Loco-manipulation -- coordinated locomotion and physical interaction with objects -- remains a major challenge for legged robots due to the need for both accurate force interaction and robustness to unmodeled dynamics. While model-based controllers provide interpretable dynamics-level planning and optimization, they are limited by model inaccuracies and computational cost. In contrast, learning-based methods offer robustness while struggling with precise modulation of interaction forces. We introduce RAMBO -- RL-Augmented Model-Based Optimal Control -- a hybrid framework that integrates model-based reaction force optimization using a simplified dynamics model and a feedback policy trained with reinforcement learning. The model-based module generates feedforward torques by solving a quadratic program, while the policy provides feedback residuals to enhance robustness in control execution. We validate our framework on a quadruped robot across a diverse set of real-world loco-manipulation tasks -- such as pushing a shopping cart, balancing a plate, and holding soft objects -- in both quadrupedal and bipedal walking. Our experiments demonstrate that RAMBO enables precise manipulation while achieving robust and dynamic locomotion, surpassing the performance of policies trained with end-to-end scheme. In addition, our method enables flexible trade-off between end-effector tracking accuracy with compliance.

* 9 pages, 6 figures

Via

Access Paper or Ask Questions

A Benchmark for Optimal Multi-Modal Multi-Robot Multi-Goal Path Planning with Given Robot Assignment

Mar 05, 2025

Valentin N. Hartmann, Tirza Heinle, Stelian Coros

Abstract:In many industrial robotics applications, multiple robots are working in a shared workspace to complete a set of tasks as quickly as possible. Such settings can be treated as multi-modal multi-robot multi-goal path planning problems, where each robot has to reach an ordered sequence of goals. Existing approaches to this type of problem solve this using prioritization or assume synchronous completion of tasks, and are thus neither optimal nor complete. We formalize this problem as a single path planning problem and introduce a benchmark encompassing a diverse range of problem instances including scenarios with various robots, planning horizons, and collaborative tasks such as handovers. Along with the benchmark, we adapt an RRT* and a PRM* planner to serve as a baseline for the planning problems. Both planners work in the composite space of all robots and introduce the required changes to work in our setting. Unlike existing approaches, our planner and formulation is not restricted to discretized 2D workspaces, supports a changing environment, and works for heterogeneous robot teams over multiple modes with different constraints, and multiple goals. Videos and code for the benchmark and the planners is available at https://vhartman.github.io/mrmg-planning/.

* 8 pages, 8 figures

Via

Access Paper or Ask Questions

A comparison of visual representations for real-world reinforcement learning in the context of vacuum gripping

Mar 04, 2025

Nico Sutter, Valentin N. Hartmann, Stelian Coros

Abstract:When manipulating objects in the real world, we need reactive feedback policies that take into account sensor information to inform decisions. This study aims to determine how different encoders can be used in a reinforcement learning (RL) framework to interpret the spatial environment in the local surroundings of a robot arm. Our investigation focuses on comparing real-world vision with 3D scene inputs, exploring new architectures in the process. We built on the SERL framework, providing us with a sample efficient and stable RL foundation we could build upon, while keeping training times minimal. The results of this study indicate that spatial information helps to significantly outperform the visual counterpart, tested on a box picking task with a vacuum gripper. The code and videos of the evaluations are available at https://github.com/nisutte/voxel-serl.

* 8 pager, 5 Figures, 5 Tables

Via

Access Paper or Ask Questions

CAIMAN: Causal Action Influence Detection for Sample Efficient Loco-manipulation

Feb 02, 2025

Yuanchen Yuan, Jin Cheng, Núria Armengol Urpí, Stelian Coros

Abstract:Enabling legged robots to perform non-prehensile loco-manipulation with large and heavy objects is crucial for enhancing their versatility. However, this is a challenging task, often requiring sophisticated planning strategies or extensive task-specific reward shaping, especially in unstructured scenarios with obstacles. In this work, we present CAIMAN, a novel framework for learning loco-manipulation that relies solely on sparse task rewards. We leverage causal action influence to detect states where the robot is in control over other entities in the environment, and use this measure as an intrinsically motivated objective to enable sample-efficient learning. We employ a hierarchical control strategy, combining a low-level locomotion policy with a high-level policy that prioritizes task-relevant velocity commands. Through simulated and real-world experiments, including object manipulation with obstacles, we demonstrate the framework's superior sample efficiency, adaptability to diverse environments, and successful transfer to hardware without fine-tuning. The proposed approach paves the way for scalable, robust, and autonomous loco-manipulation in real-world applications.

Via

Access Paper or Ask Questions

Learning More With Less: Sample Efficient Dynamics Learning and Model-Based RL for Loco-Manipulation

Jan 17, 2025

Benjamin Hoffman, Jin Cheng, Chenhao Li, Stelian Coros

Figure 1 for Learning More With Less: Sample Efficient Dynamics Learning and Model-Based RL for Loco-Manipulation

Figure 2 for Learning More With Less: Sample Efficient Dynamics Learning and Model-Based RL for Loco-Manipulation

Figure 3 for Learning More With Less: Sample Efficient Dynamics Learning and Model-Based RL for Loco-Manipulation

Figure 4 for Learning More With Less: Sample Efficient Dynamics Learning and Model-Based RL for Loco-Manipulation

Abstract:Combining the agility of legged locomotion with the capabilities of manipulation, loco-manipulation platforms have the potential to perform complex tasks in real-world applications. To this end, state-of-the-art quadrupeds with attached manipulators, such as the Boston Dynamics Spot, have emerged to provide a capable and robust platform. However, both the complexity of loco-manipulation control, as well as the black-box nature of commercial platforms pose challenges for developing accurate dynamics models and control policies. We address these challenges by developing a hand-crafted kinematic model for a quadruped-with-arm platform and, together with recent advances in Bayesian Neural Network (BNN)-based dynamics learning using physical priors, efficiently learn an accurate dynamics model from data. We then derive control policies for loco-manipulation via model-based reinforcement learning (RL). We demonstrate the effectiveness of this approach on hardware using the Boston Dynamics Spot with a manipulator, accurately performing dynamic end-effector trajectory tracking even in low data regimes.

* Master Thesis at ETH Zurich

Via

Access Paper or Ask Questions

MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Dec 16, 2024

Bhavya Sukhija, Stelian Coros, Andreas Krause, Pieter Abbeel, Carmelo Sferrazza

Figure 1 for MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Figure 2 for MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Figure 3 for MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Figure 4 for MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Abstract:Reinforcement learning (RL) algorithms aim to balance exploiting the current best strategy with exploring new options that could lead to higher rewards. Most common RL algorithms use undirected exploration, i.e., select random sequences of actions. Exploration can also be directed using intrinsic rewards, such as curiosity or model epistemic uncertainty. However, effectively balancing task and intrinsic rewards is challenging and often task-dependent. In this work, we introduce a framework, MaxInfoRL, for balancing intrinsic and extrinsic exploration. MaxInfoRL steers exploration towards informative transitions, by maximizing intrinsic rewards such as the information gain about the underlying task. When combined with Boltzmann exploration, this approach naturally trades off maximization of the value function with that of the entropy over states, rewards, and actions. We show that our approach achieves sublinear regret in the simplified setting of multi-armed bandits. We then apply this general formulation to a variety of off-policy model-free RL methods for continuous state-action spaces, yielding novel algorithms that achieve superior performance across hard exploration problems and complex scenarios such as visual control tasks.

Via

Access Paper or Ask Questions

Multi-robot workspace design and motion planning for package sorting

Dec 15, 2024

Peiyu Zeng, Yijiang Huang, Simon Huber, Stelian Coros

Abstract:Robotic systems are routinely used in the logistics industry to enhance operational efficiency, but the design of robot workspaces remains a complex and manual task, which limits the system's flexibility to changing demands. This paper aims to automate robot workspace design by proposing a computational framework to generate a budget-minimizing layout by selectively placing stationary robots on a floor grid, which includes robotic arms and conveyor belts, and plan their cooperative motions to sort packages from given input and output locations. We propose a hierarchical solving strategy that first optimizes the layout to minimize the hardware budget with a subgraph optimization subject to network flow constraints, followed by task allocation and motion planning based on the generated layout. In addition, we demonstrate how to model conveyor belts as manipulators with multiple end effectors to integrate them into our design and planning framework. We evaluated our framework on a set of simulated scenarios and showed that it can generate optimal layouts and collision-free motion trajectories, adapting to different available robots, cost assignments, and box payloads.

* 9 pages, submitted to IEEE RA-L

Via

Access Paper or Ask Questions

Problem Space Transformations for Generalisation in Behavioural Cloning

Nov 06, 2024

Kiran Doshi, Marco Bagatella, Stelian Coros

Figure 1 for Problem Space Transformations for Generalisation in Behavioural Cloning

Figure 2 for Problem Space Transformations for Generalisation in Behavioural Cloning

Figure 3 for Problem Space Transformations for Generalisation in Behavioural Cloning

Figure 4 for Problem Space Transformations for Generalisation in Behavioural Cloning

Abstract:The combination of behavioural cloning and neural networks has driven significant progress in robotic manipulation. As these algorithms may require a large number of demonstrations for each task of interest, they remain fundamentally inefficient in complex scenarios. This issue is aggravated when the system is treated as a black-box, ignoring its physical properties. This work characterises widespread properties of robotic manipulation, such as pose equivariance and locality. We empirically demonstrate that transformations arising from each of these properties allow neural policies trained with behavioural cloning to better generalise to out-of-distribution problem instances.

Via

Access Paper or Ask Questions

ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

Oct 12, 2024

Yarden As, Bhavya Sukhija, Lenart Treven, Carmelo Sferrazza, Stelian Coros, Andreas Krause

Figure 1 for ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

Figure 2 for ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

Figure 3 for ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

Figure 4 for ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

Abstract:Reinforcement learning (RL) is ubiquitous in the development of modern AI systems. However, state-of-the-art RL agents require extensive, and potentially unsafe, interactions with their environments to learn effectively. These limitations confine RL agents to simulated environments, hindering their ability to learn directly in real-world settings. In this work, we present ActSafe, a novel model-based RL algorithm for safe and efficient exploration. ActSafe learns a well-calibrated probabilistic model of the system and plans optimistically w.r.t. the epistemic uncertainty about the unknown dynamics, while enforcing pessimism w.r.t. the safety constraints. Under regularity assumptions on the constraints and dynamics, we show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time. In addition, we propose a practical variant of ActSafe that builds on latest model-based RL advancements and enables safe exploration even in high-dimensional settings such as visual control. We empirically show that ActSafe obtains state-of-the-art performance in difficult exploration tasks on standard safe deep RL benchmarks while ensuring safety during learning.

Via

Access Paper or Ask Questions

PokeFlex: A Real-World Dataset of Deformable Objects for Robotics

Oct 10, 2024

Jan Obrist, Miguel Zamora, Hehui Zheng, Ronan Hinchet, Firat Ozdemir, Juan Zarate, Robert K. Katzschmann, Stelian Coros

Figure 1 for PokeFlex: A Real-World Dataset of Deformable Objects for Robotics

Figure 2 for PokeFlex: A Real-World Dataset of Deformable Objects for Robotics

Figure 3 for PokeFlex: A Real-World Dataset of Deformable Objects for Robotics

Figure 4 for PokeFlex: A Real-World Dataset of Deformable Objects for Robotics

Abstract:Data-driven methods have shown great potential in solving challenging manipulation tasks, however, their application in the domain of deformable objects has been constrained, in part, by the lack of data. To address this, we propose PokeFlex, a dataset featuring real-world paired and annotated multimodal data that includes 3D textured meshes, point clouds, RGB images, and depth maps. Such data can be leveraged for several downstream tasks such as online 3D mesh reconstruction, and it can potentially enable underexplored applications such as the real-world deployment of traditional control methods based on mesh simulations. To deal with the challenges posed by real-world 3D mesh reconstruction, we leverage a professional volumetric capture system that allows complete 360{\deg} reconstruction. PokeFlex consists of 18 deformable objects with varying stiffness and shapes. Deformations are generated by dropping objects onto a flat surface or by poking the objects with a robot arm. Interaction forces and torques are also reported for the latter case. Using different data modalities, we demonstrated a use case for the PokeFlex dataset in online 3D mesh reconstruction. We refer the reader to our website ( https://pokeflex-dataset.github.io/ ) for demos and examples of our dataset.

Via

Access Paper or Ask Questions