Surya Murthy

University of Illinois, Urbana-Champaign

A Reinforcement Learning Approach to Quiet and Safe UAM Traffic Management

Jan 15, 2025

Surya Murthy, John-Paul Clarke, Ufuk Topcu, Zhenyu Gao

Abstract: Urban air mobility (UAM) is a transformative system that operates various small aerial vehicles in urban environments to reshape urban transportation. However, integrating UAM into existing urban environments presents a variety of complex challenges. Recent analyses of UAM's operational constraints highlight aircraft noise and system safety as key hurdles to UAM system implementation. Future UAM air traffic management schemes must ensure that the system is both quiet and safe. We propose a multi-agent reinforcement learning approach to manage UAM traffic, aiming at both vertical separation assurance and noise mitigation. Through extensive training, the reinforcement learning agent learns to balance the two primary objectives by employing altitude adjustments in a multi-layer UAM network. The results reveal the tradeoffs among noise impact, traffic congestion, and separation. Overall, our findings demonstrate the potential of reinforcement learning in mitigating UAM's noise impact while maintaining safe separation using altitude adjustments.

* AIAA SciTech 2025 Forum
* Paper presented at SciTech 2025

Autonomous Negotiation Using Comparison-Based Gradient Estimation

Aug 20, 2024

Surya Murthy, Mustafa O. Karabag, Ufuk Topcu

Abstract: Negotiation is useful for resolving conflicts in multi-agent systems. We explore autonomous negotiation in a setting where two self-interested rational agents sequentially trade items from a finite set of categories. Each agent has a utility function that depends on the number of items it possesses in each category. The offering agent makes trade offers to improve its utility without knowing the responding agent's utility function, and the responding agent accepts offers that improve its utility. We present a comparison-based algorithm for the offering agent that generates offers through previous acceptance or rejection responses without extensive information sharing. The algorithm estimates the responding agent's gradient by leveraging the rationality assumption and rejected offers to prune the space of potential gradients. After the algorithm makes a finite number of consecutively rejected offers, the responding agent is at a near-optimal state, or the agents' preferences are closely aligned. Additionally, we facilitate negotiations with humans by representing natural language feedback as comparisons that can be integrated into the proposed algorithm. We compare the proposed algorithm against random search baselines in integer and fractional trading scenarios and show that it improves the societal benefit with fewer offers.
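
The pruning idea in the abstract above can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a two-category trade, a fixed grid of candidate gradient directions, and a first-order acceptance rule (the responder accepts an offer `d` only when `g . d > 0` for its hidden gradient `g`). The names `unit`, `prune`, `true_gradient`, and the offer values are all hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def unit(angle_deg):
    """Unit vector in 2D at the given angle (degrees)."""
    r = math.radians(angle_deg)
    return (math.cos(r), math.sin(r))

def prune(candidates, offer, accepted):
    """Keep only candidate gradients consistent with the response.

    Assumed rationality rule: an accepted offer improves the responder's
    utility (g . offer > 0); a rejected offer does not (g . offer <= 0).
    """
    if accepted:
        return [g for g in candidates if dot(g, offer) > 0]
    return [g for g in candidates if dot(g, offer) <= 0]

# Hidden from the offering agent; used here only to simulate responses.
true_gradient = unit(30)

# Hypothetical candidate set: directions every 10 degrees around the circle.
candidates = [unit(a) for a in range(5, 360, 10)]

# Hypothetical trade offers (change in the responder's holdings).
offers = [(1.0, 0.0), (0.0, 1.0), (1.0, -0.9), (1.0, -2.0)]

responses = []
for offer in offers:
    accepted = dot(true_gradient, offer) > 0
    responses.append(accepted)
    candidates = prune(candidates, offer, accepted)
```

Each response, including the final rejection, cuts the candidate set with a half-space, so the surviving directions cluster around the responder's true gradient without the responder ever revealing its utility function.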

Conveying Autonomous Robot Capabilities through Contrasting Behaviour Summaries

Apr 01, 2023

Peter Du, Surya Murthy, Katherine Driggs-Campbell

Abstract: As advances in artificial intelligence enable increasingly capable learning-based autonomous agents, it becomes more challenging for human observers to efficiently construct a mental model of the agent's behaviour. In order to successfully deploy autonomous agents, humans should not only be able to understand the individual limitations of the agents but also have insight into how they compare against one another. To do so, we need effective methods for generating human-interpretable agent behaviour summaries. Single-agent behaviour summarization has been tackled in the past through methods that generate explanations for why an agent chose a particular action at a single timestep. However, for complex tasks, a per-action explanation may not be able to convey an agent's global strategy. As a result, researchers have looked towards multi-timestep summaries, which can better help humans assess an agent's overall capability. More recently, multi-step summaries have also been used for generating contrasting examples to evaluate multiple agents. However, past approaches have largely relied on unstructured search methods to generate summaries and require agents to have a discrete action space. In this paper, we present an adaptive search method for efficiently generating contrasting behaviour summaries with support for continuous state and action spaces. We perform a user study to evaluate the effectiveness of the summaries for helping humans discern the superior autonomous agent for a given task. Our results indicate that adaptive search can efficiently identify informative contrasting scenarios that enable humans to accurately select the better-performing agent with a limited observation time budget.

Scheduling for Urban Air Mobility using Safe Learning

Sep 28, 2022

Surya Murthy, Natasha A. Neogi, Suda Bharadwaj

Abstract: This work considers the scheduling problem for Urban Air Mobility (UAM) vehicles travelling between origin-destination pairs with both hard and soft trip deadlines. Each route is described by a discrete probability distribution over trip completion times (or delay) and over inter-arrival times of requests (or demand) for the route, along with a fixed hard or soft deadline. Soft deadlines carry a cost that is incurred when the deadline is missed. An online, safe scheduler is developed that ensures that hard deadlines are never missed and that the average cost of missing soft deadlines is minimized. The system is modelled as a Markov Decision Process (MDP), and safe model-based learning is used to find the probability distributions over route delays and demand. Monte Carlo Tree Search (MCTS) Earliest Deadline First (EDF) is used to safely explore the learned models in an online fashion and develop a near-optimal non-preemptive scheduling policy. These results are compared with Value Iteration (VI) and MCTS (Random) scheduling solutions.

* EPTCS 371, 2022, pp. 86-102
* In Proceedings FMAS2022 ASYDE2022, arXiv:2209.13181
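
For readers unfamiliar with the Earliest Deadline First rule named in the abstract above, here is a minimal sketch of plain non-preemptive EDF on a single resource. It is only the baseline heuristic, not the paper's MCTS-based safe scheduler, and it uses fixed trip durations rather than learned distributions. The `Trip` fields, the `edf_schedule` function, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Trip:
    name: str
    release: int            # time the request arrives
    duration: int           # time needed to complete the trip
    deadline: int           # absolute deadline
    hard: bool              # True if the deadline must never be missed
    miss_cost: float = 0.0  # cost charged when a soft deadline is missed

def edf_schedule(trips):
    """Non-preemptive EDF: once started, a trip runs to completion."""
    pending = sorted(trips, key=lambda t: t.release)
    time, order, soft_cost, hard_missed = 0, [], 0.0, []
    while pending:
        ready = [t for t in pending if t.release <= time]
        if not ready:
            # Idle until the next request arrives (list stays release-sorted).
            time = pending[0].release
            continue
        trip = min(ready, key=lambda t: t.deadline)
        pending.remove(trip)
        time += trip.duration
        order.append(trip.name)
        if time > trip.deadline:
            if trip.hard:
                hard_missed.append(trip.name)
            else:
                soft_cost += trip.miss_cost
    return order, soft_cost, hard_missed

# Hypothetical requests: one hard deadline (A) and two soft deadlines (B, C).
trips = [
    Trip("A", release=0, duration=3, deadline=10, hard=True),
    Trip("B", release=0, duration=2, deadline=4, hard=False, miss_cost=5.0),
    Trip("C", release=1, duration=2, deadline=3, hard=False, miss_cost=2.0),
]
order, soft_cost, hard_missed = edf_schedule(trips)
```

In this example, trip C arrives while B is already running and cannot preempt it, so C finishes after its soft deadline and incurs its miss cost, while the hard deadline of A is still met. Plain EDF gives no guarantee for hard deadlines in general, which is the kind of gap a safe scheduler has to close.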
