Luleå University of Technology
Abstract:In this article, we present a framework for deploying an aerial multi-agent system in large-scale subterranean environments with minimal infrastructure for supporting multi-agent operations. The multi-agent objective is to optimally and reactively allocate and execute inspection tasks in a mine, which are entered by a mine operator on-the-fly. The assignment of currently available tasks to the team of agents is accomplished through an auction-based system, where the agents bid for available tasks, which are used by a central auctioneer to optimally assigns tasks to agents. A mobile Wi-Fi mesh supports inter-agent communication and bi-directional communication between the agents and the task allocator, while the task execution is performed completely infrastructure-free. Given a task to be accomplished, a reliable and modular agent behavior is synthesized by generating behavior trees from a pool of agent capabilities, using a back-chaining approach. The auction system in the proposed framework is reactive and supports addition of new operator-specified tasks on-the-go, at any point through a user-friendly operator interface. The framework has been validated in a real underground mining environment using three aerial agents, with several inspection locations spread in an environment of almost 200 meters. The proposed framework can be utilized for missions involving rapid inspection, gas detection, distributed sensing and mapping etc. in a subterranean environment. The proposed framework and its field deployment contributes towards furthering reliable automation in large-scale subterranean environments to offload both routine and dangerous tasks from human operators to autonomous aerial robots.
Abstract:Loop closure detection in large-scale and long-term missions can be computationally demanding due to the need to identify, verify, and process numerous candidate pairs to establish edge connections for the pose graph optimization. Keyframe sampling mitigates this by reducing the number of frames stored and processed in the back-end system. In this article, we address the gap in optimized keyframe sampling for the combined problem of pose graph optimization and loop closure detection. Our Minimal Subset Approach (MSA) employs an optimization strategy with two key factors, redundancy minimization and information preservation, within a sliding window framework to efficiently reduce redundant keyframes, while preserving essential information. This method delivers comparable performance to baseline approaches, while enhancing scalability and reducing computational overhead. Finally, we evaluate MSA on relevant publicly available datasets, showcasing that it consistently performs across a wide range of environments, without requiring any manual parameter tuning.
Abstract:Collaborative multi-agent exploration of unknown environments is crucial for search and rescue operations. Effective real-world deployment must address challenges such as limited inter-agent communication and static and dynamic obstacles. This paper introduces a novel decentralized collaborative framework based on Reinforcement Learning to enhance multi-agent exploration in unknown environments. Our approach enables agents to decide their next action using an agent-centered field-of-view occupancy grid, and features extracted from $\text{A}^*$ algorithm-based trajectories to frontiers in the reconstructed global map. Furthermore, we propose a constrained communication scheme that enables agents to share their environmental knowledge efficiently, minimizing exploration redundancy. The decentralized nature of our framework ensures that each agent operates autonomously, while contributing to a collective exploration mission. Extensive simulations in Gymnasium and real-world experiments demonstrate the robustness and effectiveness of our system, while all the results highlight the benefits of combining autonomous exploration with inter-agent map sharing, advancing the development of scalable and resilient robotic exploration systems.
Abstract:This paper introduces a novel enhancement to the Decentralized Multi-Agent Reinforcement Learning (D-MARL) exploration by proposing communication-induced action space to improve the mapping efficiency of unknown environments using homogeneous agents. Efficient exploration of large environments relies heavily on inter-agent communication as real-world scenarios are often constrained by data transmission limits, such as signal latency and bandwidth. Our proposed method optimizes each agent's policy using the heterogeneous-agent proximal policy optimization algorithm, allowing agents to autonomously decide whether to communicate or to explore, that is whether to share the locally collected maps or continue the exploration. We propose and compare multiple novel reward functions that integrate inter-agent communication and exploration, enhance mapping efficiency and robustness, and minimize exploration overlap. This article presents a framework developed in ROS2 to evaluate and validate the investigated architecture. Specifically, four TurtleBot3 Burgers have been deployed in a Gazebo-designed environment filled with obstacles to evaluate the efficacy of the trained policies in mapping the exploration arena.
Abstract:In this article, we present the Layered Semantic Graphs (LSG), a novel actionable hierarchical scene graph, fully integrated with a multi-modal mission planner, the FLIE: A First-Look based Inspection and Exploration planner. The novelty of this work stems from aiming to address the task of maintaining an intuitive and multi-resolution scene representation, while simultaneously offering a tractable foundation for planning and scene understanding during an ongoing inspection mission of apriori unknown targets-of-interest in an unknown environment. The proposed LSG scheme is composed of locally nested hierarchical graphs, at multiple layers of abstraction, with the abstract concepts grounded on the functionality of the integrated FLIE planner. Furthermore, LSG encapsulates real-time semantic segmentation models that offer extraction and localization of desired semantic elements within the hierarchical representation. This extends the capability of the inspection planner, which can then leverage LSG to make an informed decision to inspect a particular semantic of interest. We also emphasize the hierarchical and semantic path-planning capabilities of LSG, which can extend inspection missions by improving situational awareness for human operators in an unknown environment. The validity of the proposed scheme is proven through extensive evaluations of the proposed architecture in simulations, as well as experimental field deployments on a Boston Dynamics Spot quadruped robot in urban outdoor environment settings.
Abstract:This article presents xFLIE, a fully integrated 3D hierarchical scene graph based autonomous inspection architecture. Specifically, we present a tightly-coupled solution of incremental 3D Layered Semantic Graphs (LSG) construction and real-time exploitation by a multi-modal autonomy, First-Look based Inspection and Exploration (FLIE) planner, to address the task of inspection of apriori unknown semantic targets of interest in unknown environments. This work aims to address the challenge of maintaining, in addition to or as an alternative to volumetric models, an intuitive scene representation during large-scale inspection missions. Through its contributions, the proposed architecture aims to provide a high-level multi-tiered abstract environment representation whilst simultaneously maintaining a tractable foundation for rapid and informed decision-making capable of enhancing inspection planning through scene understanding, what should it inspect ?, and reasoning, why should it inspect ?. The proposed LSG framework is designed to leverage the concept of nesting lower local graphs, at multiple layers of abstraction, with the abstract concepts grounded on the functionality of the integrated FLIE planner. Through intuitive scene representation, the proposed architecture offers an easily digestible environment model for human operators which helps to improve situational awareness and their understanding of the operating environment. We highlight the use-case benefits of hierarchical and semantic path-planning capability over LSG to address queries, by the integrated planner as well as the human operator. The validity of the proposed architecture is evaluated in large-scale simulated outdoor urban scenarios as well as being deployed onboard a Boston Dynamics Spot quadruped robot for extensive outdoor field experiments.
Abstract:Proactive collision avoidance measures are imperative in environments where humans and robots coexist. Moreover, the introduction of high quality legged robots into workplaces highlighted the crucial role of a robust, fully autonomous safety solution for robots to be viable in shared spaces or in co-existence with humans. This article establishes for the first time ever an innovative Detect-Track-and-Avoid Architecture (DTAA) to enhance safety and overall mission performance. The proposed novel architectyre has the merit ot integrating object detection using YOLOv8, utilizing Ultralytics embedded object tracking, and state estimation of tracked objects through Kalman filters. Moreover, a novel heuristic clustering is employed to facilitate active avoidance of multiple closely positioned objects with similar velocities, creating sets of unsafe spaces for the Nonlinear Model Predictive Controller (NMPC) to navigate around. The NMPC identifies the most hazardous unsafe space, considering not only their current positions but also their predicted future locations. In the sequel, the NMPC calculates maneuvers to guide the robot along a path planned by D$^{*}_{+}$ towards its intended destination, while maintaining a safe distance to all identified obstacles. The efficacy of the novelly suggested DTAA framework is being validated by Real-life experiments featuring a Boston Dynamics Spot robot that demonstrates the robot's capability to consistently maintain a safe distance from humans in dynamic subterranean, urban indoor, and outdoor environments.
Abstract:In this article, we address the problem of designing a scalable control architecture for a safe coordinated operation of a multi-agent system with aerial (UAVs) and ground robots (UGVs) in a confined task space. The proposed method uses Control Barrier Functions (CBFs) to impose constraints associated with (i) collision avoidance between agents, (ii) landing of UAVs on mobile UGVs, and (iii) task space restriction. Further, to account for the rapid increase in the number of constraints for a single agent with the increasing number of agents, the proposed architecture uses a centralized-decentralized Edge cluster, where a centralized node (Watcher) activates the relevant constraints, reducing the need for high onboard processing and network complexity. The distributed nodes run the controller locally to overcome latency and network issues. The proposed Edge architecture is experimentally validated using multiple aerial and ground robots in a confined environment performing a coordinated operation.
Abstract:The paper introduces a novel framework for safe and autonomous aerial physical interaction in industrial settings. It comprises two main components: a neural network-based target detection system enhanced with edge computing for reduced onboard computational load, and a control barrier function (CBF)-based controller for safe and precise maneuvering. The target detection system is trained on a dataset under challenging visual conditions and evaluated for accuracy across various unseen data with changing lighting conditions. Depth features are utilized for target pose estimation, with the entire detection framework offloaded into low-latency edge computing. The CBF-based controller enables the UAV to converge safely to the target for precise contact. Simulated evaluations of both the controller and target detection are presented, alongside an analysis of real-world detection performance.
Abstract:This paper introduces a novel compliant mechanism combining lightweight and energy dissipation for aerial physical interaction. Weighting 400~g at take-off, the mechanism is actuated in the forward body direction, enabling precise position control for force interaction and various other aerial manipulation tasks. The robotic arm, structured as a closed-loop kinematic chain, employs two deported servomotors. Each joint is actuated with a single tendon for active motion control in compression of the arm at the end-effector. Its elasto-mechanical design reduces weight and provides flexibility, allowing passive-compliant interactions without impacting the motors' integrity. Notably, the arm's damping can be adjusted based on the proposed inner frictional bulges. Experimental applications showcase the aerial system performance in both free-flight and physical interaction. The presented work may open safer applications for \ac{MAV} in real environments subject to perturbations during interaction.