Songyuan Zhang

Solving Multi-Agent Safe Optimal Control with Distributed Epigraph Form MARL

Apr 21, 2025

Songyuan Zhang, Oswin So, Mitchell Black, Zachary Serlin, Chuchu Fan

Abstract: Tasks for multi-robot systems often require the robots to collaborate and complete a team goal while maintaining safety. This problem is usually formalized as a constrained Markov decision process (CMDP), which targets minimizing a global cost and bringing the mean of constraint violation below a user-defined threshold. Inspired by real-world robotic applications, we define safety as zero constraint violation. While many safe multi-agent reinforcement learning (MARL) algorithms have been proposed to solve CMDPs, these algorithms suffer from unstable training in this setting. To tackle this, we use the epigraph form for constrained optimization to improve training stability and prove that the centralized epigraph form problem can be solved in a distributed fashion by each agent. This results in a novel centralized training distributed execution MARL algorithm named Def-MARL. Simulation experiments on 8 different tasks across 2 different simulators show that Def-MARL achieves the best overall performance, satisfies safety constraints, and maintains stable training. Real-world hardware experiments on Crazyflie quadcopters demonstrate that, compared to other methods, Def-MARL can safely coordinate agents to complete complex collaborative tasks.

* 28 pages, 16 figures; Accepted by Robotics: Science and Systems 2025
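
For readers unfamiliar with the term, the following is a sketch of the generic epigraph form of a constrained problem, not the paper's exact multi-agent formulation. The symbols $J$ (cost), $h$ (constraint function), $x$ (decision variable), and $z$ (auxiliary scalar) are assumptions introduced here for illustration, and the last line assumes the inner minimum is attained.

```latex
% Original constrained problem
\min_{x} \; J(x) \quad \text{s.t.} \quad h(x) \le 0

% Epigraph form: the cost is moved into a constraint via an auxiliary scalar z
\min_{x,\, z} \; z \quad \text{s.t.} \quad J(x) \le z, \quad h(x) \le 0

% Equivalent nested form: an outer search over z, an inner unconstrained problem in x
\min_{z} \; z \quad \text{s.t.} \quad \min_{x} \, \max\{\, h(x),\; J(x) - z \,\} \le 0
```

In the nested form the inner problem has no explicit constraint, which is one plausible reading of why the abstract associates the epigraph form with more stable training.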

Discrete GCBF Proximal Policy Optimization for Multi-agent Safe Optimal Control

Feb 05, 2025

Songyuan Zhang, Oswin So, Mitchell Black, Chuchu Fan

Abstract: Control policies that can achieve high task performance and satisfy safety constraints are desirable for any system, including multi-agent systems (MAS). One promising technique for ensuring the safety of MAS is distributed control barrier functions (CBFs). However, it is difficult to design distributed CBF-based policies for MAS that can tackle unknown discrete-time dynamics, partial observability, changing neighborhoods, and input constraints, especially when a distributed high-performance nominal policy that can achieve the task is unavailable. To tackle these challenges, we propose DGPPO, a new framework that simultaneously learns both a discrete graph CBF that handles neighborhood changes and input constraints, and a distributed high-performance safe policy for MAS with unknown discrete-time dynamics. We empirically validate our claims on a suite of multi-agent tasks spanning three different simulation engines. The results suggest that, compared with existing methods, our DGPPO framework obtains policies that achieve high task performance (matching baselines that ignore the safety constraints), and high safety rates (matching the most conservative baselines), with a constant set of hyperparameters across all environments.

* 31 pages, 15 figures, accepted by the thirteenth International Conference on Learning Representations (ICLR 2025)
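
As background for the abstract above, here is a minimal sketch of a standard discrete-time CBF condition, h(x_next) >= (1 - alpha) * h(x), applied as a safety filter. It is not DGPPO: the barrier is hand-written rather than learned, there is a single agent, and the dynamics are known. The 1D dynamics `x_next = x + u * dt`, the function names, and the values of `dt`, `alpha`, and `x_max` are all assumptions chosen for illustration.

```python
def barrier(x, x_max=1.0):
    """Hand-written barrier: h(x) >= 0 exactly on the safe set {x <= x_max}."""
    return x_max - x


def safe_input(x, u_nominal, dt=0.1, alpha=0.5, x_max=1.0):
    """Filter u_nominal so that h(x + u*dt) >= (1 - alpha) * h(x).

    For h(x) = x_max - x and x_next = x + u*dt, the condition reduces to
    u <= alpha * h(x) / dt, so the filter is a simple upper clip.
    """
    u_bound = alpha * barrier(x, x_max) / dt
    return min(u_nominal, u_bound)


def rollout(x0=0.0, u_nominal=1.0, steps=200, dt=0.1):
    """Simulate the filtered system and return the visited states."""
    xs = [x0]
    for _ in range(steps):
        u = safe_input(xs[-1], u_nominal, dt=dt)
        xs.append(xs[-1] + u * dt)
    return xs
```

Calling `rollout()` drives the state toward `x_max` with the nominal input and then slows it down near the boundary, so the barrier value stays non-negative along the trajectory.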

Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems

Apr 09, 2024

Kunal Garg, Jacob Arkin, Songyuan Zhang, Nicholas Roy, Chuchu Fan

Abstract: Multi-agent robotic systems are prone to deadlocks in an obstacle environment where the system can get stuck away from its desired location under a smooth low-level control policy. Without an external intervention, often in terms of a high-level command, it is not possible to guarantee that just a low-level control policy can resolve such deadlocks. Utilizing the generalizability and low data requirements of large language models (LLMs), this paper explores the possibility of using LLMs for deadlock resolution. We propose a hierarchical control framework where an LLM resolves deadlocks by assigning a leader and a direction for the leader to move along. A graph neural network (GNN) based low-level distributed control policy executes the assigned plan. We systematically study various prompting techniques to improve the LLM's performance in resolving deadlocks. In particular, as part of prompt engineering, we provide in-context examples for LLMs. We conducted extensive experiments on various multi-robot environments with up to 15 agents and 40 obstacles. Our results demonstrate that LLM-based high-level planners are effective in resolving deadlocks in multi-robot systems (MRS).

GCBF+: A Neural Graph Control Barrier Function Framework for Distributed Safe Multi-Agent Control

Jan 25, 2024

Songyuan Zhang, Oswin So, Kunal Garg, Chuchu Fan

Abstract: Distributed, scalable, and safe control of large-scale multi-agent systems (MAS) is a challenging problem. In this paper, we design a distributed framework for safe multi-agent control in large-scale environments with obstacles, where a large number of agents are required to maintain safety using only local information and reach their goal locations. We introduce a new class of certificates, termed graph control barrier function (GCBF), which are based on the well-established control barrier function (CBF) theory for safety guarantees and utilize a graph structure for scalable and generalizable distributed control of MAS. We develop a novel theoretical framework to prove the safety of an arbitrary-sized MAS with a single GCBF. We propose a new training framework GCBF+ that uses graph neural networks (GNNs) to parameterize a candidate GCBF and a distributed control policy. The proposed framework is distributed and is capable of directly taking point clouds from LiDAR, instead of actual state information, for real-world robotic applications. We illustrate the efficacy of the proposed method through various hardware experiments on a swarm of drones with objectives ranging from exchanging positions to docking on a moving target without collision. Additionally, we perform extensive numerical experiments, where the number and density of agents, as well as the number of obstacles, increase. Empirical results show that in complex environments with nonlinear agents (e.g., Crazyflie drones) GCBF+ outperforms the best-performing handcrafted CBF-based method by up to 20% for relatively small-scale MAS with up to 256 agents, and leading reinforcement learning (RL) methods by up to 40% for MAS with 1024 agents. Furthermore, the proposed method does not compromise goal-reaching performance to achieve high safety rates, which is a common trade-off in RL-based methods.

* 18 pages, 12 figures, submitted to IEEE T-RO. arXiv admin note: text overlap with arXiv:2311.13014

Learning Safe Control for Multi-Robot Systems: Methods, Verification, and Open Challenges

Nov 22, 2023

Kunal Garg, Songyuan Zhang, Oswin So, Charles Dawson, Chuchu Fan

Abstract: In this survey, we review the recent advances in control design methods for robotic multi-agent systems (MAS), focussing on learning-based methods with safety considerations. We start by reviewing various notions of safety and liveness properties, and modeling frameworks used for problem formulation of MAS. Then we provide a comprehensive review of learning-based methods for safe control design for multi-robot systems. We start with various types of shielding-based methods, such as safety certificates, predictive filters, and reachability tools. Then, we review the current state of control barrier certificate learning in both a centralized and distributed manner, followed by a comprehensive review of multi-agent reinforcement learning with a particular focus on safety. Next, we discuss the state-of-the-art verification tools for the correctness of learning-based methods. Based on the capabilities and the limitations of the state-of-the-art methods in learning and verification for MAS, we identify various broad themes for open challenges: how to design methods that can achieve good performance along with safety guarantees; how to decompose single-agent based centralized methods for MAS; how to account for communication-related practical issues; and how to assess transfer of theoretical guarantees to practice.

* Submitted to Annual Reviews in Control

Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality

Oct 27, 2021

Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

Abstract: Most existing imitation learning approaches assume the demonstrations are drawn from experts who are optimal, but relaxing this assumption enables us to use a wider range of data. Standard imitation learning may learn a suboptimal policy from demonstrations with varying optimality. Prior works use confidence scores or rankings to capture beneficial information from demonstrations with varying optimality, but they suffer from many limitations, e.g., manually annotated confidence scores or high average optimality of demonstrations. In this paper, we propose a general framework to learn from demonstrations with varying optimality that jointly learns the confidence score and a well-performing policy. Our approach, Confidence-Aware Imitation Learning (CAIL), learns a well-performing policy from confidence-reweighted demonstrations, while using an outer loss to track the performance of our model and to learn the confidence. We provide theoretical guarantees on the convergence of CAIL and evaluate its performance in both simulated and real robot experiments. Our results show that CAIL significantly outperforms other imitation learning methods from demonstrations with varying optimality. We further show that even without access to any optimal demonstrations, CAIL can still learn a successful policy, and outperforms prior work.

* 18 pages, 4 figures, 3 tables. Published at Conference on Neural Information Processing Systems (NeurIPS) 2021
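
To make "confidence-reweighted demonstrations" concrete, here is a minimal sketch of a confidence-weighted imitation loss. It is not the CAIL algorithm: the confidence values are fixed inputs here, whereas the abstract describes learning them through an outer loss. The function name, the scalar actions, and the squared-error loss are assumptions chosen for illustration.

```python
def weighted_imitation_loss(predicted, demonstrated, confidence):
    """Confidence-weighted mean squared error over demonstration samples.

    predicted:    actions proposed by the policy (floats)
    demonstrated: actions taken in the demonstrations (floats)
    confidence:   non-negative weight per sample; higher means more trusted
    """
    total = sum(confidence)
    if total <= 0:
        raise ValueError("at least one confidence weight must be positive")
    weighted_errors = (
        w * (p - d) ** 2
        for p, d, w in zip(predicted, demonstrated, confidence)
    )
    return sum(weighted_errors) / total
```

With a confidence of zero, a poor demonstration contributes nothing to the loss, so the policy is fit only to the samples that are trusted.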
