Abstract:Air transportation is undergoing a rapid evolution globally with the introduction of Advanced Air Mobility (AAM) and with it comes novel challenges and opportunities for transforming aviation. As AAM operations introduce increasing heterogeneity in vehicle capabilities and density, increased levels of automation are likely necessary to achieve operational safety and efficiency goals. This paper focuses on one example where increased automation has been suggested. Autonomous operations will need contingency management systems that can monitor evolving risk across a span of interrelated (or interdependent) hazards and, if necessary, execute appropriate control interventions via supervised or automated decision making. Accommodating this complex environment may require automated functions (autonomy) that apply artificial intelligence (AI) techniques that can adapt and respond to a quickly changing environment. This paper explores the use of Deep Reinforcement Learning (DRL) which has shown promising performance in complex and high-dimensional environments where the objective can be constructed as a sequential decision-making problem. An extension of a prior formulation of the contingency management problem as a Markov Decision Process (MDP) is presented and uses a DRL framework to train agents that mitigate hazards present in the simulation environment. A comparison of these learning-based agents and classical techniques is presented in terms of their performance, verification difficulties, and development process.
Abstract:Advanced Air Mobility (AAM) is the next generation of air transportation that includes new entrants such as electric vertical takeoff and landing (eVTOL) aircraft, increasingly autonomous flight operations, and small UAS package delivery. With these new vehicles and operational concepts comes a desire to increase densities far beyond what occurs today in and around urban areas, to utilize new battery technology, and to move toward more autonomously-piloted aircraft. To achieve these goals, it becomes essential to introduce new safety management system capabilities that can rapidly assess risk as it evolves across a span of complex hazards and, if necessary, mitigate risk by executing appropriate contingencies via supervised or automated decision-making during flights. Recently, reinforcement learning has shown promise for real-time decision making across a wide variety of applications including contingency management. In this work, we formulate the contingency management problem as a Markov Decision Process (MDP) and integrate the contingency management MDP into the AAM-Gym simulation framework. This enables rapid prototyping of reinforcement learning algorithms and evaluation of existing systems, thus providing a community benchmark for future algorithm development. We report baseline statistical information for the environment and provide example performance metrics.
Abstract:Advanced Air Mobility (AAM) introduces a new, efficient mode of transportation with the use of vehicle autonomy and electrified aircraft to provide increasingly autonomous transportation between previously underserved markets. Safe and efficient navigation of low altitude aircraft through highly dense environments requires the integration of a multitude of complex observations, such as surveillance, knowledge of vehicle dynamics, and weather. The processing and reasoning on these observations pose challenges due to the various sources of uncertainty in the information while ensuring cooperation with a variable number of aircraft in the airspace. These challenges coupled with the requirement to make safety-critical decisions in real-time rule out the use of conventional separation assurance techniques. We present a decentralized reinforcement learning framework to provide autonomous self-separation capabilities within AAM corridors with the use of speed and vertical maneuvers. The problem is formulated as a Markov Decision Process and solved by developing a novel extension to the sample-efficient, off-policy soft actor-critic (SAC) algorithm. We introduce the use of attention networks for variable-length observation processing and a distributed computing architecture to achieve high training sample throughput as compared to existing approaches. A comprehensive numerical study shows that the proposed framework can ensure safe and efficient separation of aircraft in high density, dynamic environments with various sources of uncertainty.