Giovanni Beltrame

Polytechnique Montreal

Guided by Guardrails: Control Barrier Functions as Safety Instructors for Robotic Learning

May 24, 2025

Maeva Guerrier, Karthik Soma, Hassan Fouad, Giovanni Beltrame

Abstract: Safety stands as the primary obstacle preventing the widespread adoption of learning-based robotic systems in our daily lives. While reinforcement learning (RL) shows promise as an effective robot learning paradigm, conventional RL frameworks often model safety by using single scalar negative rewards with immediate episode termination, failing to capture the temporal consequences of unsafe actions (e.g., sustained collision damage). In this work, we introduce a novel approach that simulates these temporal effects by applying continuous negative rewards without episode termination. Our experiments reveal that standard RL methods struggle with this model, as the accumulated negative values in unsafe zones create learning barriers. To address this challenge, we demonstrate how Control Barrier Functions (CBFs), with their proven safety guarantees, effectively help robots avoid catastrophic regions while enhancing learning outcomes. We present three CBF-based approaches, each integrating traditional RL methods with Control Barrier Functions, guiding the agent to learn safe behavior. Our empirical analysis, conducted in both simulated environments and real-world settings using a four-wheel differential drive robot, explores the possibilities of employing these approaches for safe robotic learning.

An Addendum to NeBula: Towards Extending TEAM CoSTAR's Solution to Larger Scale Environments

Apr 18, 2025

Ali Agha, Kyohei Otsu, Benjamin Morrell, David D. Fan, Sung-Kyun Kim, Muhammad Fadhil Ginting, Xianmei Lei, Jeffrey Edlund, Seyed Fakoorian, Amanda Bouman(+79 more)

Abstract: This paper presents an appendix to the original NeBula autonomy solution developed by TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), participating in the DARPA Subterranean Challenge. Specifically, this paper presents extensions to NeBula's hardware, software, and algorithmic components that focus on increasing the range and scale of the exploration environment. From the algorithmic perspective, we discuss the following extensions to the original NeBula framework: (i) large-scale geometric and semantic environment mapping; (ii) an adaptive positioning system; (iii) probabilistic traversability analysis and local planning; (iv) large-scale POMDP-based global motion planning and exploration behavior; (v) large-scale networking and decentralized reasoning; (vi) communication-aware mission planning; and (vii) multi-modal ground-aerial exploration solutions. We demonstrate the application and deployment of the presented systems and solutions in various large-scale underground environments, including limestone mine exploration scenarios as well as deployment in the DARPA Subterranean Challenge.

* IEEE Transactions on Field Robotics, vol. 1, pp. 476-526, 2024

GNN-based Decentralized Perception in Multirobot Systems for Predicting Worker Actions

Jan 08, 2025

Ali Imran, Giovanni Beltrame, David St-Onge

Abstract: In industrial environments, predicting human actions is essential for ensuring safe and effective collaboration between humans and robots. This paper introduces a perception framework that enables mobile robots to understand and share information about human actions in a decentralized way. The framework first allows each robot to build a spatial graph representing its surroundings, which it then shares with other robots. This shared spatial data is combined with temporal information to track human behavior over time. A swarm-inspired decision-making process is used to ensure all robots agree on a unified interpretation of the human's actions. Results show that adding more robots and incorporating longer time sequences improve prediction accuracy. Additionally, the consensus mechanism increases system resilience, making the multi-robot setup more reliable in dynamic industrial settings.

* Submitted to RA-L

Bridging Swarm Intelligence and Reinforcement Learning

Oct 23, 2024

Karthik Soma, Yann Bouteiller, Heiko Hamann, Giovanni Beltrame

Abstract: Swarm intelligence (SI) explores how large groups of simple individuals (e.g., insects, fish, birds) collaborate to produce complex behaviors, exemplifying that the whole is greater than the sum of its parts. A fundamental task in SI is Collective Decision-Making (CDM), where a group selects the best option among several alternatives, such as choosing an optimal foraging site. In this work, we demonstrate a theoretical and empirical equivalence between CDM and single-agent reinforcement learning (RL) in multi-armed bandit problems, utilizing concepts from opinion dynamics, evolutionary game theory, and RL. This equivalence bridges the gap between SI and RL and leads us to introduce a novel abstract RL update rule called Maynard-Cross Learning. Additionally, it provides a new population-based perspective on common RL practices like learning rate adjustment and batching. Our findings enable cross-disciplinary fertilization between RL and SI, allowing techniques from one field to enhance the understanding and methodologies of the other.

Evolution with Opponent-Learning Awareness

Oct 22, 2024

Yann Bouteiller, Karthik Soma, Giovanni Beltrame

Abstract: The universe involves many independent co-learning agents as an ever-evolving part of our observed environment. Yet, in practice, Multi-Agent Reinforcement Learning (MARL) applications are usually constrained to small, homogeneous populations and remain computationally intensive. In this paper, we study how large heterogeneous populations of learning agents evolve in normal-form games. We show how, under assumptions commonly made in the multi-armed bandit literature, Multi-Agent Policy Gradient closely resembles the Replicator Dynamic, and we further derive a fast, parallelizable implementation of Opponent-Learning Awareness tailored for evolutionary simulations. This enables us to simulate the evolution of very large populations made of heterogeneous co-learning agents, under both naive and advanced learning strategies. We demonstrate our approach in simulations of 200,000 agents, evolving in the classic games of Hawk-Dove, Stag-Hunt, and Rock-Paper-Scissors. Each game highlights distinct ways in which Opponent-Learning Awareness affects evolution.

* 12 pages, 10 figures

BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation

Oct 16, 2024

Haechan Mark Bong, Ricardo de Azambuja, Giovanni Beltrame

Abstract: Real-time aerial image segmentation plays an important role in the environmental perception of Uncrewed Aerial Vehicles (UAVs). We introduce BlabberSeg, an optimized Vision-Language Model built on CLIPSeg for on-board, real-time processing of aerial images by UAVs. BlabberSeg improves the efficiency of CLIPSeg by reusing prompt and model features, reducing computational overhead while achieving real-time open-vocabulary aerial segmentation. We validated BlabberSeg in a safe landing scenario using the Dynamic Open-Vocabulary Enhanced SafE-Landing with Intelligence (DOVESEI) framework, which uses visual servoing and open-vocabulary segmentation. BlabberSeg reduces computational costs significantly, with a speed increase of 927.41% (16.78 Hz) on an NVIDIA Jetson Orin AGX (64GB) compared with the original CLIPSeg (1.81 Hz), achieving real-time aerial segmentation with negligible loss in accuracy (2.1% as the ratio of the correctly segmented area with respect to CLIPSeg). BlabberSeg's source code is open and available online.

Multi-Objective Risk Assessment Framework for Exploration Planning Using Terrain and Traversability Analysis

Oct 04, 2024

Riana Gagnon Souleiman, Vivek Shankar Varadharajan, Giovanni Beltrame

Abstract: Exploration of unknown, unstructured environments, such as in search and rescue, cave exploration, and planetary missions, presents significant challenges due to their unpredictable nature. This unpredictability can lead to inefficient path planning and potential mission failures. We propose a multi-objective risk assessment method for exploration planning in such unconstrained environments. Our approach dynamically adjusts the weight of various risk factors to prevent the robot from undertaking lethal actions too early in the mission. By gradually increasing the allowable risk as the mission progresses, our method enables more efficient exploration. We evaluate risk based on environmental terrain properties, including elevation, slope, roughness, and traversability, and account for factors like battery life, mission duration, and travel distance. Our method is validated through experiments in various subterranean simulated cave environments. The results demonstrate that our approach ensures consistent exploration without incurring lethal actions, while introducing minimal computational overhead to the planning process.

* 7 pages, 8 figures, submitted to ICRA 2025

Frequency-based View Selection in Gaussian Splatting Reconstruction

Sep 24, 2024

Monica M. Q. Li, Pierre-Yves Lajoie, Giovanni Beltrame

Abstract: Three-dimensional reconstruction is a fundamental problem in robotics perception. We examine the problem of active view selection to perform 3D Gaussian Splatting reconstructions with as few input images as possible. Although 3D Gaussian Splatting has made significant progress in image rendering and 3D reconstruction, the quality of the reconstruction is strongly impacted by the selection of 2D images and the estimation of camera poses through Structure-from-Motion (SfM) algorithms. Current methods to select views that rely on uncertainties from occlusions, depth ambiguities, or neural network predictions directly are insufficient to handle the issue and struggle to generalize to new scenes. By ranking the potential views in the frequency domain, we are able to effectively estimate the potential information gain of new viewpoints without ground truth data. By overcoming current constraints on model architecture and efficacy, our method achieves state-of-the-art results in view selection, demonstrating its potential for efficient image-based 3D reconstruction.

* 8 pages, 4 figures

Learning Multi-agent Multi-machine Tending by Mobile Robots

Aug 29, 2024

Abdalwhab Abdalwhab, Giovanni Beltrame, Samira Ebrahimi Kahou, David St-Onge

Abstract: Robotics can help address the growing worker shortage in the manufacturing industry. Machine tending is a task that collaborative robots can tackle and that can greatly boost productivity. Nevertheless, existing robotic systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. In this work, we introduce a multi-agent multi-machine tending learning framework for mobile robots based on Multi-agent Reinforcement Learning (MARL) techniques, with the design of a suitable observation and reward. Moreover, an attention-based encoding mechanism is developed and integrated into the Multi-agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance in machine tending scenarios. Our model (AB-MAPPO) outperformed MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provide an extensive ablation study to support our various design decisions.

* 7 pages, 4 figures

Active Semantic Mapping and Pose Graph Spectral Analysis for Robot Exploration

Aug 27, 2024

Rongge Zhang, Haechan Mark Bong, Giovanni Beltrame

Abstract: Exploration in unknown and unstructured environments is a pivotal requirement for robotic applications. A robot's exploration behavior can be inherently affected by the performance of its Simultaneous Localization and Mapping (SLAM) subsystem, although SLAM and exploration are generally studied separately. In this paper, we formulate exploration as an active mapping problem and extend it with semantic information. We introduce a novel active metric-semantic SLAM approach, leveraging recent research advances in information theory and spectral graph theory: we combine semantic mutual information and the connectivity metrics of the underlying pose graph of the SLAM subsystem. We use the resulting utility function to evaluate different trajectories and select the most favorable strategy during exploration. Exploration and SLAM metrics are analyzed in experiments. Running our algorithm on the Habitat dataset, we show that, while maintaining efficiency close to state-of-the-art exploration methods, our approach effectively increases the performance of metric-semantic SLAM with a 21% reduction in average map error and a 9% improvement in average semantic classification accuracy.

* 8 pages, 5 figures
