Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Sizhao Li

RALLY: Role-Adaptive LLM-Driven Yoked Navigation for Agentic UAV Swarms

Jul 02, 2025

Ziyao Wang, Rongpeng Li, Sizhao Li, Yuming Xiang, Haiping Wang, Zhifeng Zhao, Honggang Zhang

Abstract:Intelligent control of Unmanned Aerial Vehicles (UAVs) swarms has emerged as a critical research focus, and it typically requires the swarm to navigate effectively while avoiding obstacles and achieving continuous coverage over multiple mission targets. Although traditional Multi-Agent Reinforcement Learning (MARL) approaches offer dynamic adaptability, they are hindered by the semantic gap in numerical communication and the rigidity of homogeneous role structures, resulting in poor generalization and limited task scalability. Recent advances in Large Language Model (LLM)-based control frameworks demonstrate strong semantic reasoning capabilities by leveraging extensive prior knowledge. However, due to the lack of online learning and over-reliance on static priors, these works often struggle with effective exploration, leading to reduced individual potential and overall system performance. To address these limitations, we propose a Role-Adaptive LLM-Driven Yoked navigation algorithm RALLY. Specifically, we first develop an LLM-driven semantic decision framework that uses structured natural language for efficient semantic communication and collaborative reasoning. Afterward, we introduce a dynamic role-heterogeneity mechanism for adaptive role switching and personalized decision-making. Furthermore, we propose a Role-value Mixing Network (RMIX)-based assignment strategy that integrates LLM offline priors with MARL online policies to enable semi-offline training of role selection strategies. Experiments in the Multi-Agent Particle Environment (MPE) environment and a Software-In-The-Loop (SITL) platform demonstrate that RALLY outperforms conventional approaches in terms of task coverage, convergence speed, and generalization, highlighting its strong potential for collaborative navigation in agentic multi-UAV systems.

Via

Access Paper or Ask Questions

Imitation Learning based Alternative Multi-Agent Proximal Policy Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance

Nov 06, 2023

Sizhao Li, Yuming Xiang, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

Figure 1 for Imitation Learning based Alternative Multi-Agent Proximal Policy Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance

Figure 2 for Imitation Learning based Alternative Multi-Agent Proximal Policy Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance

Figure 3 for Imitation Learning based Alternative Multi-Agent Proximal Policy Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance

Figure 4 for Imitation Learning based Alternative Multi-Agent Proximal Policy Optimization for Well-Formed Swarm-Oriented Pursuit Avoidance

Abstract:Multi-Robot System (MRS) has garnered widespread research interest and fostered tremendous interesting applications, especially in cooperative control fields. Yet little light has been shed on the compound ability of formation, monitoring and defence in decentralized large-scale MRS for pursuit avoidance, which puts stringent requirements on the capability of coordination and adaptability. In this paper, we put forward a decentralized Imitation learning based Alternative Multi-Agent Proximal Policy Optimization (IA-MAPPO) algorithm to provide a flexible and communication-economic solution to execute the pursuit avoidance task in well-formed swarm. In particular, a policy-distillation based MAPPO executor is firstly devised to capably accomplish and swiftly switch between multiple formations in a centralized manner. Furthermore, we utilize imitation learning to decentralize the formation controller, so as to reduce the communication overheads and enhance the scalability. Afterwards, alternative training is leveraged to compensate the performance loss incurred by decentralization. The simulation results validate the effectiveness of IA-MAPPO and extensive ablation experiments further show the performance comparable to a centralized solution with significant decrease in communication overheads.

* 6 pages, 5 figures, 2023 IEEE the 9th International Conference on Computer and Communications (ICCC)

Via

Access Paper or Ask Questions

Decentralized Adaptive Formation via Consensus-Oriented Multi-Agent Communication

Jul 23, 2023

Yuming Xiang, Sizhao Li, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

Abstract:Adaptive multi-agent formation control, which requires the formation to flexibly adjust along with the quantity variations of agents in a decentralized manner, belongs to one of the most challenging issues in multi-agent systems, especially under communication-limited constraints. In this paper, we propose a novel Consensus-based Decentralized Adaptive Formation (Cons-DecAF) framework. Specifically, we develop a novel multi-agent reinforcement learning method, Consensus-oriented Multi-Agent Communication (ConsMAC), to enable agents to perceive global information and establish the consensus from local states by effectively aggregating neighbor messages. Afterwards, we leverage policy distillation to accomplish the adaptive formation adjustment. Meanwhile, instead of pre-assigning specific positions of agents, we employ a displacement-based formation by Hausdorff distance to significantly improve the formation efficiency. The experimental results through extensive simulations validate that the proposed method has achieved outstanding performance in terms of both speed and stability.

* 6 pages, 5 figures

Via

Access Paper or Ask Questions