Ritchie Lee

Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems

Nov 04, 2020

Robert J. Moss, Ritchie Lee, Nicholas Visser, Joachim Hochwarth, James G. Lopez, Mykel J. Kochenderfer

Abstract: To find failure events and their likelihoods in flight-critical systems, we investigate the use of an advanced black-box stress testing approach called adaptive stress testing. We analyze a trajectory predictor from a developmental commercial flight management system which takes as input a collection of lateral waypoints and en-route environmental conditions. Our aim is to search for failure events relating to inconsistencies in the predicted lateral trajectories. The intention of this work is to find likely failures and report them back to the developers so they can address and potentially resolve shortcomings of the system before deployment. To improve search performance, this work extends the adaptive stress testing formulation to be applied more generally to sequential decision-making problems with episodic reward by collecting the state transitions during the search and evaluating at the end of the simulated rollout. We use a modified Monte Carlo tree search algorithm with progressive widening as our adversarial reinforcement learner. The performance is compared to direct Monte Carlo simulations and to the cross-entropy method as an alternative importance sampling baseline. The goal is to find potential problems otherwise not found by traditional requirements-based testing. Results indicate that our adaptive stress testing approach finds more failures and finds failures with higher likelihood relative to the baseline approaches.

* 10 pages, 10 figures, 6 algorithms. Digital Avionics Systems Conference (DASC) 2020

A Survey of Algorithms for Black-Box Safety Validation

May 06, 2020

Anthony Corso, Robert J. Moss, Mark Koren, Ritchie Lee, Mykel J. Kochenderfer

Abstract: Autonomous and semi-autonomous systems for safety-critical applications require rigorous testing before deployment. Due to the complexity of these systems, formal verification may be impossible and real-world testing may be dangerous during development. Therefore, simulation-based techniques have been developed that treat the system under test as a black box during testing. Safety validation tasks include finding disturbances to the system that cause it to fail (falsification), finding the most-likely failure, and estimating the probability that the system fails. Motivated by the prevalence of safety-critical artificial intelligence, this work provides a survey of state-of-the-art safety validation techniques with a focus on applied algorithms and their modifications for the safety validation problem. We present and discuss algorithms in the domains of optimization, path planning, reinforcement learning, and importance sampling. Problem decomposition techniques are presented to help scale algorithms to large state spaces, and a brief overview of safety-critical applications is given, including autonomous vehicles and aircraft collision avoidance systems. Finally, we present a survey of existing academic and commercially available safety validation tools.
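
One of the tasks named in this abstract, estimating the probability that a system fails, is commonly approached with importance sampling. The sketch below is a generic textbook illustration, not code from the survey: the "system" is a single standard normal disturbance, "failure" is an assumed threshold crossing, and all function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean=0.0):
    """Density of a unit-variance normal distribution centred at `mean`."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def failure_probability_is(threshold=4.0, num_samples=20000, seed=0):
    """Estimate P(x > threshold) for x ~ N(0, 1) by importance sampling.

    Samples are drawn from a proposal N(threshold, 1) that is centred on the
    failure region, and each failing sample is reweighted by the ratio of the
    nominal density to the proposal density.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal
        if x > threshold:  # assumed failure condition
            total += normal_pdf(x) / normal_pdf(x, mean=threshold)
    return total / num_samples


def exact_tail(threshold=4.0):
    """Closed-form P(x > threshold) for a standard normal, for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    print("importance sampling:", failure_probability_is())
    print("exact:", exact_tail())
```

Sampling from the nominal distribution directly would see a failure only about three times in every 100,000 draws at this threshold, which is why a proposal concentrated near the failure region is used here.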

Scalable Autonomous Vehicle Safety Validation through Dynamic Programming and Scene Decomposition

Apr 14, 2020

Anthony Corso, Ritchie Lee, Mykel J. Kochenderfer

Abstract: An open question in autonomous driving is how best to use simulation to validate the safety of autonomous vehicles. Existing techniques rely on simulated rollouts, which can be inefficient for finding rare failure events, while other techniques are designed to only discover a single failure. In this work, we present a new safety validation approach that attempts to estimate the distribution over failures of an autonomous policy using approximate dynamic programming. Knowledge of this distribution allows for the efficient discovery of many failure examples. To address the problem of scalability, we decompose complex driving scenarios into subproblems consisting of only the ego vehicle and one other vehicle. These subproblems can be solved with approximate dynamic programming and their solutions are recombined to approximate the solution to the full scenario. We apply our approach to a simple two-vehicle scenario to demonstrate the technique as well as a more complex five-vehicle scenario to demonstrate scalability. In both experiments, we observed an order of magnitude increase in the number of failures discovered compared to baseline approaches.

Validation of Image-Based Neural Network Controllers through Adaptive Stress Testing

Mar 05, 2020

Kyle D. Julian, Ritchie Lee, Mykel J. Kochenderfer

Abstract: Neural networks have become state-of-the-art for computer vision problems because of their ability to efficiently model complex functions from large amounts of data. While neural networks can be shown to perform well empirically for a variety of tasks, their performance is difficult to guarantee. Neural network verification tools have been developed that can certify robustness with respect to a given input image; however, for neural network systems used in closed-loop controllers, robustness with respect to individual images does not address multi-step properties of the neural network controller and its environment. Furthermore, neural network systems interacting in the physical world and using natural images are operating in a black-box environment, making formal verification intractable. This work combines the adaptive stress testing (AST) framework with neural network verification tools to search for the most likely sequence of image disturbances that cause the neural network controlled system to reach a failure. An autonomous aircraft taxi application is presented, and results show that the AST method finds failures with more likely image disturbances than baseline methods. Further analysis of AST results revealed an explainable cause of the failure, giving insight into the problematic scenarios that should be addressed.

* 7 pages, 6 figures

Adaptive Stress Testing for Autonomous Vehicles

Feb 05, 2019

Mark Koren, Saud Alsaif, Ritchie Lee, Mykel J. Kochenderfer

Abstract: This paper presents a method for testing the decision-making systems of autonomous vehicles. Our approach involves perturbing stochastic elements in the vehicle's environment until the vehicle is involved in a collision. Instead of applying direct Monte Carlo sampling to find collision scenarios, we formulate the problem as a Markov decision process and use reinforcement learning algorithms to find the most likely failure scenarios. This paper presents Monte Carlo Tree Search (MCTS) and Deep Reinforcement Learning (DRL) solutions that can scale to large environments. We show that DRL can find more likely failure scenarios than MCTS with fewer calls to the simulator. A simulation scenario involving a vehicle approaching a crosswalk is used to validate the framework. Our proposed approach is very general and can be easily applied to other scenarios given the appropriate models of the vehicle and the environment.

Adaptive Stress Testing: Finding Failure Events with Reinforcement Learning

Nov 06, 2018

Ritchie Lee, Ole J. Mengshoel, Anshu Saksena, Ryan Gardner, Daniel Genin, Joshua Silbermann, Michael Owen, Mykel J. Kochenderfer

Abstract: Finding the most likely path to a set of failure states is important to the analysis of safety-critical dynamic systems. While efficient solutions exist for certain classes of systems, a scalable general solution for stochastic, partially-observable, and continuous-valued systems remains challenging. Existing approaches in formal and simulation-based methods either cannot scale to large systems or are computationally inefficient. This paper presents adaptive stress testing (AST), a framework for searching a simulator for the most likely path to a failure event. We formulate the problem as a Markov decision process and use reinforcement learning to optimize it. The approach is simulation-based and does not require internal knowledge of the system. As a result, the approach is well suited to black-box testing of large systems. We present formulations both for systems where the state is fully observable and for systems where it is partially observable. In the latter case, we present a modified Monte Carlo tree search algorithm that only requires access to the pseudorandom number generator of the simulator to overcome partial observability. We also present an extension of the framework, called differential adaptive stress testing (DAST), that can be used to find failures that occur in one system but not in another. This type of differential analysis is useful in applications such as regression testing, where one is concerned with finding areas of relative weakness compared to a baseline. We demonstrate the effectiveness of the approach on an aircraft collision avoidance application, where we stress test a prototype aircraft collision avoidance system to find high-probability scenarios of near mid-air collisions.

* 28 pages, 13 figures
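
The idea of searching for the most likely path to a failure event can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the paper's implementation: the simulator is a hypothetical 1-D random walk with Gaussian disturbances, failure is an assumed threshold crossing, the return is the path log-likelihood with an assumed fixed penalty when no failure occurs, and plain random sampling stands in for the reinforcement learner.

```python
import math
import random


def gaussian_logpdf(x, sigma=1.0):
    """Log-density of a zero-mean Gaussian disturbance."""
    return -0.5 * (x / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def rollout(disturbances, threshold=3.0):
    """Toy simulator (hypothetical): a random walk fails on reaching `threshold`.

    Returns (failed, log-likelihood of the disturbances applied so far).
    """
    position = 0.0
    log_likelihood = 0.0
    for d in disturbances:
        position += d
        log_likelihood += gaussian_logpdf(d)
        if position >= threshold:
            return True, log_likelihood
    return False, log_likelihood


def episode_return(disturbances, miss_penalty=1e4):
    """Return that favours likely paths and heavily penalises non-failures."""
    failed, log_likelihood = rollout(disturbances)
    return log_likelihood if failed else log_likelihood - miss_penalty


def random_search(num_episodes=2000, horizon=5, seed=0):
    """Stand-in for the learner: sample paths, keep the most likely failure."""
    rng = random.Random(seed)
    failures, best = 0, None
    for _ in range(num_episodes):
        path = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
        failed, log_likelihood = rollout(path)
        if failed:
            failures += 1
            if best is None or log_likelihood > best[0]:
                best = (log_likelihood, path)
    return failures, best


if __name__ == "__main__":
    count, best = random_search()
    print("failures found:", count)
    print("best log-likelihood:", best[0] if best else None)
```

Under this return, two moderate disturbances that together cross the threshold score higher than one extreme disturbance, which is the sense in which the search prefers likely failures over merely possible ones.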

Interpretable Categorization of Heterogeneous Time Series Data

Jan 26, 2018

Ritchie Lee, Mykel J. Kochenderfer, Ole J. Mengshoel, Joshua Silbermann

Abstract: Understanding heterogeneous multivariate time series data is important in many applications ranging from smart homes to aviation. Learning models of heterogeneous multivariate time series that are also human-interpretable is challenging and not adequately addressed by the existing literature. We propose grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs extend decision trees with a grammar framework. Logical expressions derived from a context-free grammar are used for branching in place of simple thresholds on attributes. The added expressivity enables support for a wide range of data types while retaining the interpretability of decision trees. In particular, when a grammar based on temporal logic is used, we show that GBDTs can be used for the interpretable classification of high-dimensional and heterogeneous time series data. Furthermore, we show how GBDTs can also be used for categorization, which is a combination of clustering and generating interpretable explanations for each cluster. We apply GBDTs to analyze the classic Australian Sign Language dataset as well as data on near mid-air collisions (NMACs). The NMAC data comes from aircraft simulations used in the development of the next-generation Airborne Collision Avoidance System (ACAS X).

* 9 pages, 5 figures, 2 tables, SIAM International Conference on Data Mining (SDM) 2018
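
Branching on a logical expression instead of an attribute threshold can be sketched as follows. This is an illustrative simplification, not the authors' learning algorithm: the candidate expressions are written by hand rather than derived from a context-free grammar, only two temporal operators are included, the split criterion (Gini impurity) is an assumption, and every name is hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Series = Sequence[float]
Predicate = Callable[[Series], bool]


def eventually(cond: Callable[[float], bool]) -> Predicate:
    """Temporal 'eventually': cond holds at some time step of the series."""
    return lambda series: any(cond(v) for v in series)


def always(cond: Callable[[float], bool]) -> Predicate:
    """Temporal 'always': cond holds at every time step of the series."""
    return lambda series: all(cond(v) for v in series)


def gini(labels: List[str]) -> float:
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(
    data: List[Series], labels: List[str], candidates: Dict[str, Predicate]
) -> Tuple[str, float]:
    """Pick the candidate expression whose true/false split is purest."""
    best_name, best_score = "", float("inf")
    for name, predicate in candidates.items():
        left = [y for x, y in zip(data, labels) if predicate(x)]
        right = [y for x, y in zip(data, labels) if not predicate(x)]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_name, best_score = name, score
    return best_name, best_score


if __name__ == "__main__":
    series = [[0, 1, 5], [0, 0, 1], [2, 6, 1], [1, 1, 1]]
    classes = ["a", "b", "a", "b"]
    rules = {
        "F(x > 4)": eventually(lambda v: v > 4),
        "G(x < 10)": always(lambda v: v < 10),
    }
    print(best_split(series, classes, rules))
```

The chosen rule doubles as the explanation for the branch: a reader can see that class "a" series are those that eventually exceed 4, which is the interpretability property the abstract emphasizes.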

Predicting the behavior of interacting humans by fusing data from multiple sources

Aug 09, 2014

Erik J. Schlicht, Ritchie Lee, David H. Wolpert, Mykel J. Kochenderfer, Brendan Tracey

Abstract: Multi-fidelity methods combine inexpensive low-fidelity simulations with costly but high-fidelity simulations to produce an accurate model of a system of interest at minimal cost. They have proven useful in modeling physical systems and have been applied to engineering problems such as wing-design optimization. During human-in-the-loop experimentation, it has become increasingly common to use online platforms, like Mechanical Turk, to run low-fidelity experiments to gather human performance data in an efficient manner. One concern with these experiments is that the results obtained from the online environment generalize poorly to the actual domain of interest. To address this limitation, we extend traditional multi-fidelity approaches to allow us to combine fewer data points from high-fidelity human-in-the-loop experiments with plentiful but less accurate data from low-fidelity experiments to produce accurate models of how humans interact. We present both model-based and model-free methods, and summarize the predictive performance of each method under different conditions.

* Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)
