Esen Yel

Entropy-regularized Point-based Value Iteration

Feb 14, 2024

Harrison Delecki, Marcell Vazquez-Chanlatte, Esen Yel, Kyle Wray, Tomer Arnon, Stefan Witwicki, Mykel J. Kochenderfer

Abstract: Model-based planners for partially observable problems must accommodate both model uncertainty during planning and goal uncertainty during objective inference. However, model-based planners may be brittle under these types of uncertainty because they rely on an exact model and tend to commit to a single optimal behavior. Inspired by results in the model-free setting, we propose an entropy-regularized model-based planner for partially observable problems. Entropy regularization promotes policy robustness for planning and objective inference by encouraging policies to be no more committed to a single action than necessary. We evaluate the robustness and objective inference performance of entropy-regularized policies in three problem domains. Our results show that entropy-regularized policies outperform non-entropy-regularized baselines in terms of higher expected returns under modeling errors and higher accuracy during objective inference.
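
To make the entropy-regularization idea in the abstract above concrete, here is a minimal sketch of the standard maximum-entropy ("soft") backup over a set of action values. It is not the paper's point-based algorithm: it ignores beliefs and alpha vectors entirely, and the names `soft_backup`, `q_values`, and `temperature` are assumptions introduced only for illustration.

```python
import math


def soft_backup(q_values, temperature):
    """Entropy-regularized backup over action values (illustrative sketch).

    Returns the soft value  temperature * log(sum(exp(q / temperature)))
    and the matching softmax policy. `temperature` is a hypothetical name
    for the entropy weight; larger values give a less committed policy.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    policy = [w / total for w in weights]
    value = q_max + temperature * math.log(total)
    return value, policy


if __name__ == "__main__":
    # Two equally good actions: the policy stays uncommitted between them.
    print(soft_backup([1.0, 1.0], temperature=1.0))
    # A very small temperature approaches the usual greedy max backup.
    print(soft_backup([0.0, 10.0], temperature=0.01))
```

With equal action values the policy is uniform, which is one way to read "no more committed to a single action than necessary"; as the temperature shrinks, the backup approaches the ordinary maximum.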

Predicting Future Spatiotemporal Occupancy Grids with Semantics for Autonomous Driving

Oct 03, 2023

Maneekwan Toyungyernsub, Esen Yel, Jiachen Li, Mykel J. Kochenderfer

Abstract: For autonomous vehicles to proactively plan safe trajectories and make informed decisions, they must be able to predict the future occupancy states of the local environment. However, common issues with occupancy prediction include predictions where moving objects vanish or become blurred, particularly at longer time horizons. We propose an environment prediction framework that incorporates environment semantics for future occupancy prediction. Our method first semantically segments the environment and uses this information along with the occupancy information to predict the spatiotemporal evolution of the environment. We validate our approach on the real-world Waymo Open Dataset. Compared to baseline methods, our model has higher prediction accuracy and is capable of maintaining moving object appearances in the predictions for longer prediction time horizons.

* 7 pages, 5 figures

Efficient Determination of Safety Requirements for Perception Systems

Jul 03, 2023

Sydney M. Katz, Anthony L. Corso, Esen Yel, Mykel J. Kochenderfer

Abstract: Perception systems operate as a subcomponent of the general autonomy stack, and perception system designers often need to optimize performance characteristics while maintaining safety with respect to the overall closed-loop system. For this reason, it is useful to distill high-level safety requirements into component-level requirements on the perception system. In this work, we focus on efficiently determining sets of safe perception system performance characteristics given a black-box simulator of the fully-integrated, closed-loop system. We combine the advantages of common black-box estimation techniques such as Gaussian processes and threshold bandits to develop a new estimation method, which we call smoothing bandits. We demonstrate our method on a vision-based aircraft collision avoidance problem and show improvements in terms of both accuracy and efficiency over the Gaussian process and threshold bandit baselines.

* 10 pages, 14 figures, submitted to the 2023 Digital Avionics Systems Conference

Experience Filter: Using Past Experiences on Unseen Tasks or Environments

May 29, 2023

Anil Yildiz, Esen Yel, Anthony L. Corso, Kyle H. Wray, Stefan J. Witwicki, Mykel J. Kochenderfer

Abstract: One of the bottlenecks of training autonomous vehicle (AV) agents is the variability of training environments. Since learning optimal policies for unseen environments is often very costly and requires substantial data collection, it becomes computationally intractable to train the agent on every possible environment or task the AV may encounter. This paper introduces a zero-shot filtering approach to interpolate learned policies of past experiences to generalize to unseen ones. We use an experience kernel to correlate environments. These correlations are then exploited to produce policies for new tasks or environments from learned policies. We demonstrate our methods on an autonomous vehicle driving through T-intersections with different characteristics, where its behavior is modeled as a partially observable Markov decision process (POMDP). We first construct compact representations of learned policies for POMDPs with unknown transition functions given a dataset of sequential actions and observations. Then, we filter parameterized policies of previously visited environments to generate policies for new, unseen environments. We demonstrate our approaches on both an actual AV and a high-fidelity simulator. Results indicate that our experience filter offers a fast, low-effort, and near-optimal solution to create policies for tasks or environments never seen before. Furthermore, the generated new policies outperform the policy learned using the entire data collected from past environments, suggesting that the correlation among different environments can be exploited and irrelevant ones can be filtered out.

* Accepted at IEEE Intelligent Vehicles Symposium (IV) 2023

Backward Reachability Analysis of Neural Feedback Loops: Techniques for Linear and Nonlinear Systems

Sep 28, 2022

Nicholas Rober, Sydney M. Katz, Chelsea Sidrane, Esen Yel, Michael Everett, Mykel J. Kochenderfer, Jonathan P. How

Abstract: The increasing prevalence of neural networks (NNs) in safety-critical applications calls for methods to certify safe behavior. This paper presents a backward reachability approach for safety verification of neural feedback loops (NFLs), i.e., closed-loop systems with NN control policies. While recent works have focused on forward reachability as a strategy for safety certification of NFLs, backward reachability offers advantages over the forward strategy, particularly in obstacle avoidance scenarios. Prior works have developed techniques for backward reachability analysis for systems without NNs, but the presence of NNs in the feedback loop presents a unique set of problems due to the nonlinearities in their activation functions and because NN models are generally not invertible. To overcome these challenges, we use existing forward NN analysis tools to efficiently find an over-approximation of the backprojection (BP) set, i.e., the set of states for which the NN control policy will drive the system to a given target set. We present frameworks for calculating BP over-approximations for both linear and nonlinear systems with control policies represented by feedforward NNs and propose computationally efficient strategies. We use numerical results from a variety of models to showcase the proposed algorithms, including a demonstration of safety certification for a 6D system.

* 14 pages, 14 figures. arXiv admin note: substantial text overlap with arXiv:2204.08319

Dynamics-Aware Spatiotemporal Occupancy Prediction in Urban Environments

Sep 27, 2022

Maneekwan Toyungyernsub, Esen Yel, Jiachen Li, Mykel J. Kochenderfer

Abstract: Detection and segmentation of moving obstacles, along with prediction of the future occupancy states of the local environment, are essential for autonomous vehicles to proactively make safe and informed decisions. In this paper, we propose a framework that integrates the two capabilities together using deep neural network architectures. Our method first detects and segments moving objects in the scene, and uses this information to predict the spatiotemporal evolution of the environment around autonomous vehicles. To address the problem of direct integration of both static-dynamic object segmentation and environment prediction models, we propose using occupancy-based environment representations across the whole framework. Our method is validated on the real-world Waymo Open Dataset and demonstrates higher prediction accuracy than baseline methods.

* Accepted at 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

Learning Enabled Fast Planning and Control in Dynamic Environments with Intermittent Information

Sep 09, 2022

Matthew Cleaveland, Esen Yel, Yiannis Kantaros, Insup Lee, Nicola Bezzo

Abstract: This paper addresses a safe planning and control problem for mobile robots operating in communication- and sensor-limited dynamic environments. In this case, the robots cannot sense the objects around them and must instead rely on intermittent, external information about the environment, as in, e.g., underwater applications. The challenge in this case is that the robots must plan using only this stale data, while accounting for any noise in the data or uncertainty in the environment. To address this challenge, we propose a compositional technique which leverages neural networks to quickly plan and control a robot through crowded and dynamic environments using only intermittent information. Specifically, our tool uses reachability analysis and potential fields to train a neural network that is capable of generating safe control actions. We demonstrate our technique both in simulation with an underwater vehicle crossing a crowded shipping channel and with real experiments with ground vehicles in communication- and sensor-limited environments.

Uncertainty-Aware Online Merge Planning with Learned Driver Behavior

Jul 11, 2022

Liam A. Kruse, Esen Yel, Ransalu Senanayake, Mykel J. Kochenderfer

Abstract: Safe and reliable autonomy solutions are a critical component of next-generation intelligent transportation systems. Autonomous vehicles in such systems must reason about complex and dynamic driving scenes in real time and anticipate the behavior of nearby drivers. Human driving behavior is highly nuanced and specific to individual traffic participants. For example, drivers might display cooperative or non-cooperative behaviors in the presence of merging vehicles. These behaviors must be estimated and incorporated in the planning process for safe and efficient driving. In this work, we present a framework for estimating the cooperation level of drivers on a freeway and planning merging maneuvers with the drivers' latent behaviors explicitly modeled. The latent parameter estimation problem is solved using a particle filter to approximate the probability distribution over the cooperation level. A partially observable Markov decision process (POMDP) that includes the latent state estimate is solved online to extract a policy for a merging vehicle. We evaluate our method in a high-fidelity automotive simulator against methods that are agnostic to latent states or rely on a priori assumptions about actor behavior.
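
The particle-filter step mentioned in the abstract above can be sketched generically as follows. This is a plain bootstrap particle filter over a scalar latent value in [0, 1], not the authors' implementation: the Gaussian observation model, the jitter scale, and the names `particle_filter_step` and `estimate` are all assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, observation, noise_std, rng):
    """One bootstrap update over a scalar latent 'cooperation level'.

    Assumption: the observation is the latent value plus Gaussian noise
    with standard deviation `noise_std` (a hypothetical sensor model).
    """
    # Weight each particle by the likelihood of the observation.
    weights = [
        math.exp(-0.5 * ((observation - p) / noise_std) ** 2)
        for p in particles
    ]
    if sum(weights) == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        weights = [1.0] * len(particles)
    # Resample in proportion to the weights.
    resampled = rng.choices(particles, weights=weights, k=len(particles))
    # Add small jitter to keep diversity, then clamp to the valid range.
    return [min(1.0, max(0.0, p + rng.gauss(0.0, 0.02))) for p in resampled]


def estimate(particles):
    """Posterior mean of the latent value under the particle set."""
    return sum(particles) / len(particles)


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.random() for _ in range(500)]  # uniform prior on [0, 1]
    for _ in range(5):
        particles = particle_filter_step(particles, 0.8, 0.1, rng)
    print(estimate(particles))
```

In a planner of the kind the abstract describes, the resulting particle set would serve as the belief over the latent behavior; how that belief enters the POMDP is not specified here.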

A Meta-Learning-based Trajectory Tracking Framework for UAVs under Degraded Conditions

Apr 30, 2021

Esen Yel, Nicola Bezzo

Abstract: Due to changes in model dynamics or unexpected disturbances, an autonomous robotic system may experience unforeseen challenges during real-world operations which may affect its safety and intended behavior: in particular, actuator and system failures and external disturbances are among the most common causes of degraded mode of operation. To deal with this problem, in this work, we present a meta-learning-based approach to improve the trajectory tracking performance of an unmanned aerial vehicle (UAV) under actuator faults and disturbances which have not been previously experienced. Our approach leverages meta-learning to adapt the system's model at runtime to make accurate predictions about the system's future state. A runtime monitoring and validation technique is proposed to decide when the system needs to adapt its model by considering a data pruning procedure for efficient learning. Finally, the desired trajectory is adapted based on future predictions by borrowing robust control logic to make the system track the original and desired path without needing to access the system's controller. The proposed framework is applied and validated in both simulations and experiments on a faulty UAV navigation case study, demonstrating a drastic increase in tracking performance.

* 7 pages, 9 figures
