Abstract:Unlabeled motion planning involves assigning a set of robots to target locations while ensuring collision avoidance, aiming to minimize the total distance traveled. The problem forms an essential building block for multi-robot systems in applications such as exploration, surveillance, and transportation. We address this problem in a decentralized setting where each robot knows only the positions of its $k$-nearest robots and $k$-nearest targets. This scenario combines elements of combinatorial assignment and continuous-space motion planning, posing significant scalability challenges for traditional centralized approaches. To overcome these challenges, we propose a decentralized policy learned via a Graph Neural Network (GNN). The GNN enables robots to determine (1) what information to communicate to neighbors and (2) how to integrate received information with local observations for decision-making. We train the GNN using imitation learning with the centralized Hungarian algorithm as the expert policy, and further fine-tune it using reinforcement learning to avoid collisions and enhance performance. Extensive empirical evaluations demonstrate the scalability and effectiveness of our approach. The GNN policy trained on 100 robots generalizes to scenarios with up to 500 robots, outperforming state-of-the-art solutions by 8.6\% on average and significantly surpassing greedy decentralized methods. This work lays the foundation for solving multi-robot coordination problems in settings where scalability is important.
Abstract:Coverage control is the problem of navigating a robot swarm to collaboratively monitor features or a phenomenon of interest not known a priori. The problem is challenging in decentralized settings with robots that have limited communication and sensing capabilities. We propose a learnable Perception-Action-Communication (LPAC) architecture for the problem, wherein a convolution neural network (CNN) processes localized perception; a graph neural network (GNN) facilitates robot communications; finally, a shallow multi-layer perceptron (MLP) computes robot actions. The GNN enables collaboration in the robot swarm by computing what information to communicate with nearby robots and how to incorporate received information. Evaluations show that the LPAC models -- trained using imitation learning -- outperform standard decentralized and centralized coverage control algorithms. The learned policy generalizes to environments different from the training dataset, transfers to larger environments with more robots, and is robust to noisy position estimates. The results indicate the suitability of LPAC architectures for decentralized navigation in robot swarms to achieve collaborative behavior.
Abstract:Collaboration in large robot swarms to achieve a common global objective is a challenging problem in large environments due to limited sensing and communication capabilities. The robots must execute a Perception-Action-Communication (PAC) loop -- they perceive their local environment, communicate with other robots, and take actions in real time. A fundamental challenge in decentralized PAC systems is to decide what information to communicate with the neighboring robots and how to take actions while utilizing the information shared by the neighbors. Recently, this has been addressed using Graph Neural Networks (GNNs) for applications such as flocking and coverage control. Although conceptually, GNN policies are fully decentralized, the evaluation and deployment of such policies have primarily remained centralized or restrictively decentralized. Furthermore, existing frameworks assume sequential execution of perception and action inference, which is very restrictive in real-world applications. This paper proposes a framework for asynchronous PAC in robot swarms, where decentralized GNNs are used to compute navigation actions and generate messages for communication. In particular, we use aggregated GNNs, which enable the exchange of hidden layer information between robots for computational efficiency and decentralized inference of actions. Furthermore, the modules in the framework are asynchronous, allowing robots to perform sensing, extracting information, communication, action inference, and control execution at different frequencies. We demonstrate the effectiveness of GNNs executed in the proposed framework in navigating large robot swarms for collaborative coverage of large environments.
Abstract:Active target sensing is the task of discovering and classifying an unknown number of targets in an environment and is critical in search-and-rescue missions. This paper develops a deep reinforcement learning approach to plan informative trajectories that increase the likelihood for an uncrewed aerial vehicle (UAV) to discover missing targets. Our approach efficiently (1) explores the environment to discover new targets, (2) exploits its current belief of the target states and incorporates inaccurate sensor models for high-fidelity classification, and (3) generates dynamically feasible trajectories for an agile UAV by employing a motion primitive library. Extensive simulations on randomly generated environments show that our approach is more efficient in discovering and classifying targets than several other baselines. A unique characteristic of our approach, in contrast to heuristic informative path planning approaches, is that it is robust to varying amounts of deviations of the prior belief from the true target distribution, thereby alleviating the challenge of designing heuristics specific to the application conditions.
Abstract:We propose a new formulation for the multi-robot task planning and allocation problem that incorporates (a) precedence relationships between tasks; (b) coordination for tasks allowing multiple robots to achieve increased efficiency; and (c) cooperation through the formation of robot coalitions for tasks that cannot be performed by individual robots alone. In our formulation, the tasks and the relationships between the tasks are specified by a task graph. We define a set of reward functions over the task graph's nodes and edges. These functions model the effect of robot coalition size on the task performance, and incorporate the influence of one task's performance on a dependent task. Solving this problem optimally is NP-hard. However, using the task graph formulation allows us to leverage min-cost network flow approaches to obtain approximate solutions efficiently. Additionally, we explore a mixed integer programming approach, which gives optimal solutions for small instances of the problem but is computationally expensive. We also develop a greedy heuristic algorithm as a baseline. Our modeling and solution approaches result in task plans that leverage task precedence relationships and robot coordination and cooperation to achieve high mission performance, even in large missions with many agents.
Abstract:The area coverage problem is the task of efficiently servicing a given two-dimensional surface using sensors mounted on robots such as unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). We present a novel formulation for generating coverage routes for multiple capacity-constrained robots, where capacity can be specified in terms of battery life or flight time. Traversing the environment incurs demands on the robot resources, which have capacity limits. The central aspect of our approach is transforming the area coverage problem into a line coverage problem (i.e., coverage of linear features), and then generating routes that minimize the total cost of travel while respecting the capacity constraints. We define two modes of travel: (1) servicing and (2) deadheading, which correspond to whether a robot is performing task-specific actions or not. Our formulation allows separate and asymmetric travel costs and demands for the two modes. Furthermore, the cells computed from cell decomposition, aimed at minimizing the number of turns, are not required to be monotone polygons. We develop new procedures for cell decomposition and generation of service tracks that can handle non-monotone polygons with or without holes. We establish the efficacy of our algorithm on a ground robot dataset with 25 indoor environments and an aerial robot dataset with 300 outdoor environments. The algorithm generates solutions whose costs are 10% lower on average than state-of-the-art methods. We additionally demonstrate our algorithm in experiments with UAVs.
Abstract:Line coverage is the task of servicing a given set of one-dimensional features in an environment. It is important for the inspection of linear infrastructure such as road networks, power lines, and oil and gas pipelines. This paper addresses the single robot line coverage problem for aerial and ground robots by modeling it as an optimization problem on a graph. The problem belongs to the broad class of arc routing problems and is closely related to the asymmetric rural postman problem (RPP). The paper presents an integer linear programming formulation with proof of correctness. Using the minimum cost flow problem, we develop approximation algorithms with guarantees on the solution quality. These guarantees also improve the existing results for the asymmetric RPP. The main algorithm partitions the problem into three cases based on the structure of the required graph, i.e., the graph induced by the features that require servicing. We evaluate our algorithms on road networks from the 50 most populous cities in the world. The algorithms, augmented with improvement heuristics, run within 3s and generate solutions that are within 10% of the optimum. We experimentally demonstrate our algorithms with commercial UAVs on the UNC Charlotte campus road network.
Abstract:The line coverage problem is to find efficient routes for coverage of linear features by one or more resource-constrained robots. Linear features model environments such as road networks, power lines, and oil and gas pipelines. We define two modes of travel for the robots: servicing and deadheading. A robot services a feature if it performs task-specific actions, e.g., taking images, as it traverses the feature; otherwise, it is deadheading. Traversing the environment incurs costs (e.g., travel time) and demands on resources (e.g., battery life). Servicing and deadheading can have different cost and demand functions, and we further permit them to be direction-dependent. We model the environment as a graph and provide an integer linear program. As the problem is NP-hard, we develop a fast and efficient heuristic algorithm, Merge-Embed-Merge (MEM). The constructive property of the algorithm enables solving the multi-depot version for large graphs. We further extend the MEM algorithm to handle turning costs and nonholonomic constraints. We benchmark the algorithm on a dataset of 50 road networks and demonstrate the algorithm in experiments using aerial robots on road networks.
Abstract:This paper introduces the correlated arc orienteering problem (CAOP), where the task is to find routes for a team of robots to maximize the collection of rewards associated with features in the environment. These features can be one-dimensional or points in the environment, and can have spatial correlation, i.e., visiting a feature in the environment may provide a portion of the reward associated with a correlated feature. A robot incurs costs as it traverses the environment, and the total cost for its route is limited by a resource constraint such as battery life or operation time. As environments are often large, we permit multiple depots where the robots must start and end their routes. The CAOP generalizes the correlated orienteering problem (COP), where the rewards are only associated with point features, and the arc orienteering problem (AOP), where the rewards are not spatially correlated. We formulate a mixed integer quadratic program (MIQP) that formalizes the problem and gives optimal solutions. However, the problem is NP-hard, and therefore we develop an efficient greedy constructive algorithm. We illustrate the problem with two different applications: informative path planning for methane gas leak detection and coverage of road networks.
Abstract:Simultaneous localization and Planning (SLAP) is a crucial ability for an autonomous robot operating under uncertainty. In its most general form, SLAP induces a continuous POMDP (partially-observable Markov decision process), which needs to be repeatedly solved online. This paper addresses this problem and proposes a dynamic replanning scheme in belief space. The underlying POMDP, which is continuous in state, action, and observation space, is approximated offline via sampling-based methods, but operates in a replanning loop online to admit local improvements to the coarse offline policy. This construct enables the proposed method to combat changing environments and large localization errors, even when the change alters the homotopy class of the optimal trajectory. It further outperforms the state-of-the-art FIRM (Feedback-based Information RoadMap) method by eliminating unnecessary stabilization steps. Applying belief space planning to physical systems brings with it a plethora of challenges. A key focus of this paper is to implement the proposed planner on a physical robot and show the SLAP solution performance under uncertainty, in changing environments and in the presence of large disturbances, such as a kidnapped robot situation.