Alexandre M. Bayen

(U)NFV: Supervised and Unsupervised Neural Finite Volume Methods for Solving Hyperbolic PDEs

May 29, 2025

Nathan Lichtlé, Alexi Canesse, Zhe Fu, Hossein Nick Zinat Matin, Maria Laura Delle Monache, Alexandre M. Bayen

Abstract: We introduce (U)NFV, a modular neural network architecture that generalizes classical finite volume (FV) methods for solving hyperbolic conservation laws. Hyperbolic partial differential equations (PDEs) are challenging to solve, particularly conservation laws whose physically relevant solutions contain shocks and discontinuities. FV methods are widely used for their mathematical properties: convergence to entropy solutions, flow conservation, or total variation diminishing, but often lack accuracy and flexibility in complex settings. Neural Finite Volume addresses these limitations by learning update rules over extended spatial and temporal stencils while preserving conservation structure. It supports both supervised training on solution data (NFV) and unsupervised training via weak-form residual loss (UNFV). Applied to first-order conservation laws, (U)NFV achieves up to 10x lower error than Godunov's method, outperforms ENO/WENO, and rivals discontinuous Galerkin solvers with far less complexity. On traffic modeling problems, both from PDEs and from experimental highway data, (U)NFV captures nonlinear wave dynamics with significantly higher fidelity and scalability than traditional FV approaches.
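
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a classical Godunov finite volume update for a scalar conservation law. It is not the (U)NFV architecture. The Greenshields flux with unit free-flow speed and unit jam density, the periodic boundary, and all function names are assumptions chosen for illustration.

```python
RHO_CRIT = 0.5  # density that maximizes the assumed flux rho * (1 - rho)


def flux(rho):
    """Greenshields flux (illustrative assumption, not taken from the paper)."""
    return rho * (1.0 - rho)


def godunov_flux(rho_left, rho_right):
    """Godunov interface flux for a concave flux, in demand/supply form."""
    demand = flux(min(rho_left, RHO_CRIT))   # what the upstream cell can send
    supply = flux(max(rho_right, RHO_CRIT))  # what the downstream cell can accept
    return min(demand, supply)


def godunov_step(rho, dt, dx):
    """One conservative update on a periodic grid.

    interface[i] is the flux between cell i-1 and cell i, so cell i loses
    interface[i+1] and gains interface[i]. Stability needs dt/dx <= 1 here,
    because the characteristic speed |1 - 2*rho| is at most 1.
    """
    n = len(rho)
    interface = [godunov_flux(rho[i - 1], rho[i]) for i in range(n)]
    return [
        rho[i] - dt / dx * (interface[(i + 1) % n] - interface[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    state = [0.2] * 10 + [0.8] * 10  # free flow running into congestion
    mass_before = sum(state)
    for _ in range(20):
        state = godunov_step(state, dt=0.5, dx=1.0)
    print("total mass before:", mass_before, "after:", sum(state))
```

Because every interface flux is added to one cell and subtracted from its neighbor, the total mass is preserved up to rounding. This is the conservation structure that, per the abstract, the learned update rules are designed to keep.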

Pareto Control Barrier Function for Inner Safe Set Maximization Under Input Constraints

Oct 05, 2024

Xiaoyang Cao, Zhe Fu, Alexandre M. Bayen

Abstract: This article introduces the Pareto Control Barrier Function (PCBF) algorithm to maximize the inner safe set of dynamical systems under input constraints. Traditional Control Barrier Functions (CBFs) ensure safety by maintaining system trajectories within a safe set but often fail to account for realistic input constraints. To address this problem, we leverage the Pareto multi-task learning framework to balance competing objectives of safety and safe set volume. The PCBF algorithm is applicable to high-dimensional systems and is computationally efficient. We validate its effectiveness through comparison with Hamilton-Jacobi reachability for an inverted pendulum and through simulations on a 12-dimensional quadrotor system. Results show that the PCBF consistently outperforms existing methods, yielding larger safe sets and ensuring safety under input constraints.

* Submitted to ACC 2025

Reinforcement Learning Based Oscillation Dampening: Scaling up Single-Agent RL algorithms to a 100 AV highway field operational test

Feb 26, 2024

Kathy Jang, Nathan Lichtlé, Eugene Vinitsky, Adit Shah, Matthew Bunting, Matthew Nice, Benedetto Piccoli, Benjamin Seibold, Daniel B. Work, Maria Laura Delle Monache (+3 more)

Abstract: In this article, we explore the technical details of the reinforcement learning (RL) algorithms that were deployed in the largest field test of automated vehicles designed to smooth traffic flow in history as of 2023, uncovering the challenges and breakthroughs that come with developing RL controllers for automated vehicles. We delve into the fundamental concepts behind RL algorithms and their application in the context of self-driving cars, discussing the developmental process from simulation to deployment in detail, from designing simulators to reward function shaping. We present the results in both simulation and deployment, discussing the flow-smoothing benefits of the RL controller. From understanding the basics of Markov decision processes to exploring advanced techniques such as deep RL, our article offers a comprehensive overview and deep dive of the theoretical foundations and practical implementations driving this rapidly evolving field. We also showcase real-world case studies and alternative research projects that highlight the impact of RL controllers in revolutionizing autonomous driving. From tackling complex urban environments to dealing with unpredictable traffic scenarios, these intelligent controllers are pushing the boundaries of what automated vehicles can achieve. Furthermore, we examine the safety considerations and hardware-focused technical details surrounding deployment of RL controllers into automated vehicles. As these algorithms learn and evolve through interactions with the environment, ensuring their behavior aligns with safety standards becomes crucial. We explore the methodologies and frameworks being developed to address these challenges, emphasizing the importance of building reliable control systems for automated vehicles.

Traffic Smoothing Controllers for Autonomous Vehicles Using Deep Reinforcement Learning and Real-World Trajectory Data

Jan 18, 2024

Nathan Lichtlé, Kathy Jang, Adit Shah, Eugene Vinitsky, Jonathan W. Lee, Alexandre M. Bayen

Abstract: Designing traffic-smoothing cruise controllers that can be deployed onto autonomous vehicles is a key step towards improving traffic flow, reducing congestion, and enhancing fuel efficiency in mixed autonomy traffic. We bypass the common issue of having to carefully fine-tune a large traffic microsimulator by leveraging real-world trajectory data from the I-24 highway in Tennessee, replayed in a one-lane simulation. Using standard deep reinforcement learning methods, we train energy-reducing wave-smoothing policies. As an input to the agent, we observe the speed and distance of only the vehicle in front, which are local states readily available on most recent vehicles, as well as non-local observations about the downstream state of the traffic. We show that at a low 4% autonomous vehicle penetration rate, we achieve significant fuel savings of over 15% on trajectories exhibiting many stop-and-go waves. Finally, we analyze the smoothing effect of the controllers and demonstrate robustness to adding lane-changing into the simulation as well as the removal of downstream information.

* Accepted to be published as part of the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC) 2023, Bilbao, Spain, September 24-28, 2023

Unified Automatic Control of Vehicular Systems with Reinforcement Learning

Jul 30, 2022

Zhongxia Yan, Abdul Rahman Kreidieh, Eugene Vinitsky, Alexandre M. Bayen, Cathy Wu

Abstract: Emerging vehicular systems with increasing proportions of automated components present opportunities for optimal control to mitigate congestion and increase efficiency. There has been a recent interest in applying deep reinforcement learning (DRL) to these nonlinear dynamical systems for the automatic design of effective control strategies. Despite conceptual advantages of DRL being model-free, studies typically nonetheless rely on training setups that are painstakingly specialized to specific vehicular systems. This is a key challenge to efficient analysis of diverse vehicular and mobility systems. To this end, this article contributes a streamlined methodology for vehicular microsimulation and discovers high performance control strategies with minimal manual design. A variable-agent, multi-task approach is presented for optimization of vehicular Partially Observed Markov Decision Processes. The methodology is experimentally validated on mixed autonomy traffic systems, where fractions of vehicles are automated; empirical improvement, typically 15-60% over a human driving baseline, is observed in all configurations of six diverse open or closed traffic systems. The study reveals numerous emergent behaviors resembling wave mitigation, traffic signaling, and ramp metering. Finally, the emergent behaviors are analyzed to produce interpretable control strategies, which are validated against the learned control strategies.

* 16 pages, 14 figures, IEEE Transactions on Automation Science and Engineering (T-ASE), 2022

Reachability Analysis for FollowerStopper: Safety Analysis and Experimental Results

Dec 29, 2021

Fang-Chieh Chou, Marsalis Gibson, Rahul Bhadani, Alexandre M. Bayen, Jonathan Sprinkle

Abstract: Motivated by earlier work and the development of a new algorithm, the FollowerStopper, this article uses reachability analysis to verify the safety of the FollowerStopper algorithm, a controller designed for dampening stop-and-go traffic waves. With more than 1100 miles of driving data collected by our physical platform, we validate our analysis results by comparing them to human driving behaviors. The FollowerStopper controller has been demonstrated to dampen stop-and-go traffic waves at low speed, but previous analysis of its relative safety has been limited to upper and lower bounds of acceleration. To expand upon previous analysis, reachability analysis is used to investigate safety at the speeds at which it was originally tested and also at higher speeds. Two formulations of safety analysis with different criteria are shown: distance-based and time headway-based. The FollowerStopper is considered safe under the distance-based criterion. However, simulation results demonstrate that the FollowerStopper is not representative of human drivers: it follows too closely behind vehicles, specifically at a distance humans would deem unsafe. On the other hand, under the time headway-based safety analysis, the FollowerStopper is no longer considered safe. A modified FollowerStopper is proposed to satisfy the time-based safety criterion. Simulation results for the modified FollowerStopper show that its response better represents human driver behavior.
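
The difference between the two criteria named in the abstract can be illustrated with a small sketch. This is not the paper's reachability analysis. The threshold values (`min_gap_m`, `min_headway_s`) and function names are hypothetical, chosen only to show why the same following gap can pass a distance check and fail a time-headway check at higher speed.

```python
def distance_safe(gap_m, min_gap_m=5.0):
    """Distance-based check: the gap to the leader must exceed a fixed length.

    min_gap_m is a hypothetical threshold, not a value from the paper.
    """
    return gap_m >= min_gap_m


def time_headway_s(gap_m, speed_mps):
    """Time the follower needs to cover the current gap at its current speed."""
    if speed_mps <= 0.0:
        return float("inf")  # a stopped follower never closes the gap
    return gap_m / speed_mps


def headway_safe(gap_m, speed_mps, min_headway_s=1.0):
    """Time headway-based check; min_headway_s is a hypothetical threshold."""
    return time_headway_s(gap_m, speed_mps) >= min_headway_s


if __name__ == "__main__":
    gap = 10.0  # metres to the lead vehicle
    for speed in (5.0, 30.0):  # follower speed in m/s
        print(
            "speed", speed,
            "distance_safe", distance_safe(gap),
            "headway_safe", headway_safe(gap, speed),
        )
```

The distance check ignores speed, so a controller that keeps a short fixed gap can satisfy it at any speed, while the headway check tightens as speed grows.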

* 6 pages; 10 figures; ICRA publication

Failout: Achieving Failure-Resilient Inference in Distributed Neural Networks

Feb 18, 2020

Ashkan Yousefpour, Brian Q. Nguyen, Siddartha Devic, Guanhua Wang, Aboudy Kreidieh, Hans Lobel, Alexandre M. Bayen, Jason P. Jue

Abstract: When a neural network is partitioned and distributed across physical nodes, failure of physical nodes causes the failure of the neural units that are placed on those nodes, which results in a significant performance drop. Current approaches focus on resiliency of training in distributed neural networks. However, resiliency of inference in distributed neural networks is less explored. We introduce ResiliNet, a scheme for making inference in distributed neural networks resilient to physical node failures. ResiliNet combines two concepts to provide resiliency: skip connection in residual neural networks, and a novel technique called failout, which is introduced in this paper. Failout simulates physical node failure conditions during training using dropout, and is specifically designed to improve the resiliency of distributed neural networks. The results of the experiments and ablation studies using three datasets confirm the ability of ResiliNet to provide inference resiliency for distributed neural networks.
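
The two ideas in the abstract, dropping whole nodes during training and routing around a failed node, can be sketched as follows. This is a simplified illustration under stated assumptions, not the ResiliNet implementation: each physical node is modeled as a plain Python function, the survival probabilities are hypothetical, and a failed node is bypassed by passing its input straight to the next node.

```python
import random


def failout_mask(survival_probs, rng):
    """Sample which nodes are alive for one training pass.

    Unlike ordinary dropout on individual units, the whole node is kept or
    dropped. survival_probs holds one hypothetical probability per node.
    """
    return [rng.random() < p for p in survival_probs]


def forward_with_failures(x, node_fns, alive):
    """Run a chain of nodes, bypassing the ones marked as failed.

    A failed node contributes nothing; its input is forwarded unchanged,
    which stands in for a skip connection around that node.
    """
    h = x
    for fn, ok in zip(node_fns, alive):
        if ok:
            h = fn(h)
    return h


if __name__ == "__main__":
    rng = random.Random(0)
    nodes = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
    alive = failout_mask([0.9, 0.7, 0.9], rng)
    print("alive:", alive, "output:", forward_with_failures(1, nodes, alive))
```

Sampling a fresh mask on every training pass exposes the downstream nodes to the same missing-input conditions they would see when a physical node fails at inference time.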

* 10 pages

Inter-Level Cooperation in Hierarchical Reinforcement Learning

Dec 05, 2019

Abdul Rahman Kreidieh, Samyak Parajuli, Nathan Lichtle, Yiling You, Rayyan Nasr, Alexandre M. Bayen

Abstract: This article presents a novel algorithm for promoting cooperation between internal actors in a goal-conditioned hierarchical reinforcement learning (HRL) policy. Current techniques for HRL policy optimization treat the higher and lower level policies as separate entities which are trained to maximize different objective functions, rendering the HRL problem formulation more similar to a general sum game than a single-agent task. Within this setting, we hypothesize that improved cooperation between the internal agents of a hierarchy can simplify the credit assignment problem from the perspective of the high-level policies, thereby leading to significant improvements to training in situations where intricate sets of action primitives must be performed to yield improvements in performance. In order to promote cooperation within this setting, we propose the inclusion of a connected gradient term to the gradient computations of the higher level policies. Our method is demonstrated to achieve superior results to existing techniques in a set of difficult long time horizon tasks.

Guardians of the Deep Fog: Failure-Resilient DNN Inference from Edge to Cloud

Sep 21, 2019

Ashkan Yousefpour, Siddartha Devic, Brian Q. Nguyen, Aboudy Kreidieh, Alan Liao, Alexandre M. Bayen, Jason P. Jue

Abstract: Partitioning and distributing deep neural networks (DNNs) over physical nodes such as edge, fog, or cloud nodes, could enhance sensor fusion, and reduce bandwidth and inference latency. However, when a DNN is distributed over physical nodes, failure of the physical nodes causes the failure of the DNN units that are placed on these nodes. The performance of the inference task will be unpredictable, and most likely, poor, if the distributed DNN is not specifically designed and properly trained for failures. Motivated by this, we introduce deepFogGuard, a DNN architecture augmentation scheme for making the distributed DNN inference task failure-resilient. To articulate deepFogGuard, we introduce the elements and a model for the resiliency of distributed DNN inference. Inspired by the concept of residual connections in DNNs, we introduce skip hyperconnections in distributed DNNs, which are the basis of deepFogGuard's design to provide resiliency. Next, our extensive experiments using two existing datasets for the sensing and vision applications confirm the ability of deepFogGuard to provide resiliency for distributed DNNs in edge-cloud networks.

* Accepted to ACM AIChallengeIoT 2019

Expert Level control of Ramp Metering based on Multi-task Deep Reinforcement Learning

Jan 30, 2017

Francois Belletti, Daniel Haziza, Gabriel Gomes, Alexandre M. Bayen

Abstract: This article shows how the recent breakthroughs in Reinforcement Learning (RL) that have enabled robots to learn to play arcade video games, walk or assemble colored bricks, can be used to perform other tasks that are currently at the core of engineering cyberphysical systems. We present the first use of RL for the control of systems modeled by discretized non-linear Partial Differential Equations (PDEs) and devise a novel algorithm to use non-parametric control techniques for large multi-agent systems. We show how neural network based RL enables the control of discretized PDEs whose parameters are unknown, random, and time-varying. We introduce an algorithm of Mutual Weight Regularization (MWR) which alleviates the curse of dimensionality of multi-agent control schemes by sharing experience between agents while giving each agent the opportunity to specialize its action policy so as to tailor it to the local parameters of the part of the system it is located in.
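
The abstract does not give the form of the MWR term, so the sketch below is only one plausible reading: a soft penalty that pulls each agent's weight vector toward the mean across agents, letting agents share what they learn while still allowing local deviation. The squared-distance form, the `strength` parameter, and the function name are assumptions, not the paper's definition.

```python
def mutual_weight_penalty(agent_weights, strength):
    """Penalty that grows as agents' weight vectors drift apart.

    agent_weights: list of equal-length weight vectors, one per agent.
    strength: hypothetical coefficient; 0 means fully independent agents,
    larger values push the agents toward shared weights.
    """
    n_agents = len(agent_weights)
    dim = len(agent_weights[0])
    # Component-wise mean of the weight vectors across agents.
    mean = [sum(w[k] for w in agent_weights) / n_agents for k in range(dim)]
    # Sum of squared distances from each agent's weights to that mean.
    spread = sum(
        (w[k] - mean[k]) ** 2 for w in agent_weights for k in range(dim)
    )
    return strength * spread


if __name__ == "__main__":
    shared = [[1.0, 2.0], [1.0, 2.0]]
    specialized = [[0.0, 0.0], [2.0, 2.0]]
    print(mutual_weight_penalty(shared, 0.5))
    print(mutual_weight_penalty(specialized, 0.5))
```

In such a scheme the penalty would be added to each agent's own training loss, so the coefficient controls how much specialization to local conditions is tolerated.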
