Abstract:Predicting traffic flow in data-scarce cities is challenging due to limited historical data. To address this, we leverage transfer learning by identifying periodic patterns common to data-rich cities using a customized variant of Dynamic Mode Decomposition (DMD): constrained Hankelized DMD (TrHDMD). This method uncovers common eigenmodes (urban heartbeats) in traffic patterns and transfers them to data-scarce cities, significantly enhancing prediction performance. TrHDMD reduces the need for extensive training datasets by utilizing prior knowledge from other cities. By applying Koopman operator theory to multi-city loop detector data, we identify stable, interpretable, and time-invariant traffic modes. Injecting ``urban heartbeats'' into forecasting tasks improves prediction accuracy and has the potential to enhance traffic management strategies for cities with varying data infrastructures. Our work introduces cross-city knowledge transfer via shared Koopman eigenmodes, offering actionable insights and reliable forecasts for data-scarce urban environments.
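As a rough illustration of the Hankelized DMD step only, the sketch below extracts a daily eigenmode from a synthetic signal. It uses plain exact DMD on a delay-embedded matrix; the constraint and the cross-city transfer step of TrHDMD are not reproduced. The function names `hankelize` and `dmd`, the two-sensor cosine signal, and the choices `delays=4` and `rank=2` are all assumptions made for this example.

```python
import numpy as np

def hankelize(X, delays):
    """Stack `delays` time-shifted copies of X (sensors x time steps)."""
    n_cols = X.shape[1] - delays + 1
    return np.vstack([X[:, i:i + n_cols] for i in range(delays)])

def dmd(H, rank):
    """Exact DMD of the snapshot pairs (H[:, :-1], H[:, 1:])."""
    X, Y = H[:, :-1], H[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, V = U[:, :rank], s[:rank], Vh[:rank].conj().T
    A_tilde = (U.conj().T @ Y @ V) / s      # reduced operator U* Y V S^-1
    eigvals, W = np.linalg.eig(A_tilde)
    modes = ((Y @ V) / s) @ W               # DMD modes in delay coordinates
    return eigvals, modes

# Synthetic hourly "flow" at two hypothetical sensors with a 24-hour cycle.
t = np.arange(24 * 10)
flows = np.vstack([np.cos(2 * np.pi * t / 24),
                   np.cos(2 * np.pi * t / 24 + 0.7)])

eigvals, modes = dmd(hankelize(flows, delays=4), rank=2)
periods = 2 * np.pi / np.abs(np.angle(eigvals))   # oscillation period in hours
```

Here the eigenvalue phase gives the period of each mode, so a shared daily rhythm would show up as a pair of eigenvalues near the unit circle with a period of about 24 hours.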
Abstract:The strong performance of simple neural networks is often attributed to their nonlinear activations. However, a linear view of neural networks makes understanding and controlling networks much more approachable. We draw from a dynamical systems view of neural networks, offering a fresh perspective by using Koopman operator theory and its connections with dynamic mode decomposition (DMD). Together, they offer a framework for linearizing dynamical systems by embedding the system into an appropriate observable space. By reframing a neural network as a dynamical system, we demonstrate that we can replace the nonlinear layer in a pretrained multi-layer perceptron (MLP) with a finite-dimensional linear operator. In addition, we analyze the eigenvalues of DMD and the right singular vectors of SVD to present evidence that time-delayed coordinates provide a straightforward and highly effective observable space for Koopman theory to linearize a network layer. Consequently, we replace layers of an MLP trained on the Yin-Yang dataset with predictions from a DMD model, achieving a model accuracy of up to 97.3%, compared to the original 98.4%. In addition, we replace layers in an MLP trained on the MNIST dataset, achieving up to 95.8%, compared to the original 97.2% on the test set.
Abstract:Deep learning methods are emerging as popular computational tools for solving forward and inverse problems in traffic flow. In this paper, we study a neural operator framework for learning solutions to nonlinear hyperbolic partial differential equations with applications in macroscopic traffic flow models. In this framework, an operator is trained to map heterogeneous and sparse traffic input data to the complete macroscopic traffic state in a supervised learning setting. We chose a physics-informed Fourier neural operator ($\pi$-FNO) as the operator, where an additional physics loss based on a discrete conservation law regularizes the problem during training to improve the shock predictions. We also propose to use training data generated from random piecewise constant input data to systematically capture the shock and rarefaction solutions. From experiments using the LWR traffic flow model, we found superior accuracy in predicting the density dynamics of a ring-road network and an urban signalized road. We also found that the operator can be trained using simple traffic density dynamics, e.g., consisting of $2-3$ vehicle queues and $1-2$ traffic signal cycles, and it can predict density dynamics for heterogeneous vehicle queue distributions and multiple traffic signal cycles $(\geq 2)$ with an acceptable error. The extrapolation error grew sub-linearly with input complexity for a proper choice of the model architecture and training data. Adding a physics regularizer aided in learning long-term traffic density dynamics, especially for problems with periodic boundary data.
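To make the idea of a physics loss based on a discrete conservation law concrete, here is a minimal sketch for the LWR model on a ring road. It assumes a Greenshields flow-density relation with normalized parameters and a Godunov (demand/supply) flux; the abstract does not state which discretization the $\pi$-FNO uses, so these choices, along with the function names, are assumptions. NumPy is used for readability, whereas a training loop would need a differentiable framework.

```python
import numpy as np

V_FREE, RHO_MAX = 1.0, 1.0        # assumed normalized Greenshields parameters
RHO_CRIT = RHO_MAX / 2            # density at which the flow is maximal

def flow(rho):
    """Greenshields flow-density relation q(rho)."""
    return V_FREE * rho * (1.0 - rho / RHO_MAX)

def godunov_flux(rho_left, rho_right):
    """Interface flux as the minimum of upstream demand and downstream supply."""
    demand = flow(np.minimum(rho_left, RHO_CRIT))
    supply = flow(np.maximum(rho_right, RHO_CRIT))
    return np.minimum(demand, supply)

def conservation_residual(rho, dt, dx):
    """Residual of the discrete conservation law; rho has shape (times, cells)."""
    flux_out = godunov_flux(rho[:-1], np.roll(rho[:-1], -1, axis=1))
    flux_in = np.roll(flux_out, 1, axis=1)          # periodic (ring-road) boundary
    return rho[1:] - rho[:-1] + dt / dx * (flux_out - flux_in)

def physics_loss(rho, dt, dx):
    """Mean squared conservation residual of a predicted density field."""
    return np.mean(conservation_residual(rho, dt, dx) ** 2)

def simulate(rho0, n_steps, dt, dx):
    """Reference Godunov solution, used here only to sanity-check the loss."""
    rho = [rho0]
    for _ in range(n_steps):
        cur = rho[-1]
        f_out = godunov_flux(cur, np.roll(cur, -1))
        rho.append(cur - dt / dx * (f_out - np.roll(f_out, 1)))
    return np.array(rho)

# Piecewise constant initial density: one queue on an otherwise light ring road.
rho0 = np.where(np.arange(50) < 20, 0.8, 0.2)
sim = simulate(rho0, n_steps=40, dt=0.5, dx=1.0)    # dt/dx = 0.5 satisfies CFL
loss_on_solution = physics_loss(sim, dt=0.5, dx=1.0)
```

A field produced by the same scheme has a residual at round-off level, while a prediction that creates or destroys vehicles is penalized, which is the role such a regularizer would play during training.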
Abstract:Recent studies reveal that Autonomous Vehicles (AVs) can be manipulated by hidden backdoors, causing them to perform harmful actions when activated by physical triggers. However, it is still unclear how these triggers can be activated while adhering to traffic principles. Understanding this vulnerability in a dynamic traffic environment is crucial. This work addresses this gap by presenting physical trigger activation as a reachability problem of a controlled dynamic system. Our technique identifies security-critical areas in traffic systems where trigger conditions for accidents can be reached, and provides intended trajectories for how those conditions can be reached. Testing on typical traffic scenarios showed that the system can be successfully driven to trigger conditions with a near 100% activation rate. Our method helps identify AV vulnerabilities and enables effective safety strategies.
Abstract:Deep Reinforcement Learning (DRL) enhances the efficiency of Autonomous Vehicles (AVs), but also makes them susceptible to backdoor attacks that can result in traffic congestion or collisions. Backdoor functionality is typically incorporated by contaminating training datasets with covert malicious data to maintain high precision on genuine inputs while inducing the desired (malicious) outputs for specific inputs chosen by adversaries. Current defenses against backdoors mainly focus on image classification using image-based features, which cannot be readily transferred to the regression task of DRL-based AV controllers since the inputs are continuous sensor data, i.e., the combinations of velocity and distance of the AV and its surrounding vehicles. Our proposed method adds well-designed noise to the input to neutralize backdoors. The approach involves learning an optimal smoothing (noise) distribution to preserve the normal functionality of genuine inputs while neutralizing backdoors. By doing so, the resulting model is expected to be more resilient against backdoor attacks while maintaining high accuracy on genuine inputs. The effectiveness of the proposed method is verified on a simulated traffic system based on a microscopic traffic simulator, where experimental results show that the smoothed traffic controller can neutralize all trigger samples and maintain the performance of relieving traffic congestion.
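The effect of input noise on a narrowly triggered backdoor can be sketched with a toy example. Everything below is hypothetical: the hand-written car-following controller, the trigger state, the fixed Gaussian noise with `sigma=0.5`, and the use of the median to aggregate outputs. The abstract describes learning the smoothing distribution, which this sketch does not attempt.

```python
import numpy as np

TRIGGER = np.array([12.0, 15.0])   # hypothetical (velocity, gap) trigger state

def backdoored_controller(state):
    """Toy car-following controller with a planted backdoor (illustrative only)."""
    if np.all(np.abs(state - TRIGGER) < 0.05):
        return 3.0                                   # malicious hard acceleration
    velocity, gap = state
    return 0.1 * (gap - 20.0) - 0.2 * (velocity - 10.0)

def smoothed_controller(controller, state, sigma, n_samples=512, rng=None):
    """Query the controller on noisy copies of the state and take the median."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=(n_samples, state.size))
    outputs = np.array([controller(state + eps) for eps in noise])
    return float(np.median(outputs))

raw_action = backdoored_controller(TRIGGER)
safe_action = smoothed_controller(backdoored_controller, TRIGGER,
                                  sigma=0.5, rng=np.random.default_rng(0))
```

Because the trigger occupies a tiny region of the input space, most noisy copies miss it, so the aggregated action stays close to the benign braking command instead of the malicious acceleration. The trade-off is that larger noise also blurs the response to genuine inputs, which is presumably why the noise distribution is learned rather than fixed.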
Abstract:We study learning weak solutions to nonlinear hyperbolic partial differential equations (H-PDE), which have been difficult to learn due to discontinuities in their solutions. We use a physics-informed variant of the Fourier Neural Operator ($\pi$-FNO) to learn the weak solutions. We empirically quantify the generalization/out-of-sample error of the $\pi$-FNO solver as a function of input complexity, i.e., the distributions of initial and boundary conditions. Our testing results show that $\pi$-FNO generalizes well to unseen initial and boundary conditions. We find that the generalization error grows linearly with input complexity. Further, adding a physics-informed regularizer improved the prediction of discontinuities in the solution. We use the Lighthill-Whitham-Richards (LWR) traffic flow model as a guiding example to illustrate the results.
Abstract:The adaptive smoothing method (ASM) is a standard data-driven technique used in traffic state estimation. The ASM has free parameters which, in practice, are set to generally accepted values based on intuition. However, we note that the heuristically chosen values often result in unphysical predictions by the ASM. In this work, we propose a neural network based on the ASM which tunes those parameters automatically by learning from sparse data from road sensors. We refer to it as the adaptive smoothing neural network (ASNN). We also propose a modified ASNN (MASNN), which makes it a strong learner by using ensemble averaging. The ASNN and MASNN are trained and tested on two real-world datasets. Our experiments reveal that the ASNN and the MASNN outperform the conventional ASM.
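For readers unfamiliar with the ASM, the sketch below shows the conventional method and the free parameters that the ASNN is meant to tune. It follows the usual formulation: two exponentially weighted averages, skewed along free-flow and congested wave speeds, blended by a tanh weight. The default values are commonly quoted heuristics and should be read as assumptions here, as should the function name and the three-sensor example.

```python
import numpy as np

def asm_speed(x, t, xs, ts, vs, sigma=0.6, tau=1.1 / 60,
              c_free=80.0, c_cong=-15.0, v_thr=60.0, dv=20.0):
    """Estimate speed (km/h) at position x (km) and time t (h) from sensor data.

    xs, ts, vs: arrays of sensor positions, times, and measured speeds.
    sigma, tau: smoothing widths in space (km) and time (h).
    c_free, c_cong: assumed wave speeds in free and congested traffic (km/h).
    v_thr, dv: crossover speed and transition width of the blending weight.
    """
    dx, dt = x - xs, t - ts

    def smooth(c):
        # Kernel skewed along the characteristic with wave speed c.
        w = np.exp(-np.abs(dx) / sigma - np.abs(dt - dx / c) / tau)
        return np.sum(w * vs) / np.sum(w)

    v_free, v_cong = smooth(c_free), smooth(c_cong)
    w_cong = 0.5 * (1.0 + np.tanh((v_thr - min(v_free, v_cong)) / dv))
    return w_cong * v_cong + (1.0 - w_cong) * v_free

# Three hypothetical sensors 1 km apart; the middle one reports congestion.
xs = np.array([0.0, 1.0, 2.0])
ts = np.zeros(3)
vs = np.array([100.0, 30.0, 100.0])
v_mid = asm_speed(1.0, 0.0, xs, ts, vs)
```

All six keyword arguments are the free parameters in question; fixing them by hand is what the abstract refers to as heuristic selection.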
Abstract:Backdoor attacks pose a new threat to Deep Neural Networks (DNNs), where a backdoor is inserted into the neural network by poisoning the training dataset, causing inputs that contain the adversary's trigger to be misclassified. The major challenge in defending against these attacks is that only the attacker knows the secret trigger and the target class. The problem is further exacerbated by the recent introduction of "Hidden Triggers", where the triggers are carefully fused into the input, bypassing detection by human inspection and causing backdoor identification through anomaly detection to fail. To defend against such imperceptible attacks, in this work we systematically analyze how representations, i.e., the set of neuron activations for a given DNN when using the training data as inputs, are affected by backdoor attacks. We propose PiDAn, an algorithm based on coherence optimization that purifies the poisoned data. Our analysis shows that representations of poisoned data and authentic data in the target class are still embedded in different linear subspaces, which implies that they show different coherence with some latent spaces. Based on this observation, the proposed PiDAn algorithm learns a sample-wise weight vector to maximize the projected coherence of weighted samples, where we demonstrate that the learned weight vector has a natural "grouping effect" and is distinguishable between authentic data and poisoned data. This enables the systematic detection and mitigation of backdoor attacks. Based on our theoretical analysis and experimental results, we demonstrate the effectiveness of PiDAn in defending against backdoor attacks that use different settings of poisoned samples on the GTSRB and ILSVRC2012 datasets. Our PiDAn algorithm can detect more than 90% of infected classes and identify 95% of poisoned samples.
Abstract:Space-time visualizations of macroscopic or microscopic traffic variables are a qualitative tool used by traffic engineers to understand and analyze different aspects of road traffic dynamics. We present a deep learning method to learn the macroscopic traffic speed dynamics from these space-time visualizations, and demonstrate its application in the framework of traffic state estimation. Compared to existing estimation approaches, our approach allows a finer estimation resolution, eliminates the dependence on the initial conditions, and is agnostic to external factors such as traffic demand, road inhomogeneities and driving behaviors. Our model respects causality in traffic dynamics, which improves the robustness of estimation. We present the high-resolution traffic speed fields estimated for several freeway sections using the data obtained from the Next Generation Simulation Program (NGSIM) and German Highway (HighD) datasets. We further demonstrate the quality and utility of the estimation by inferring vehicle trajectories from the estimated speed fields, and discuss the benefits of deep neural network models in approximating the traffic dynamics.
Abstract:We propose a kinematic wave based Deep Convolutional Neural Network (Deep CNN) to estimate high resolution traffic speed dynamics from sparse probe vehicle trajectories. To that end, we introduce two key approaches that allow us to incorporate kinematic wave theory principles to improve the robustness of existing learning-based estimation methods. First, we use an anisotropic traffic-based kernel for the CNN. This kernel is designed to explicitly take forward and backward traffic wave propagation characteristics into account during reconstruction in the space-time domain. Second, we use simulated data for training the CNN. This implicitly imposes physical constraints on the patterns learned by the CNN, providing an alternate, unrestricted way to integrate complex traffic behaviors into learning models. We present the speed fields estimated using the anisotropic kernel and highlight its advantages over its isotropic counterpart in terms of predicting shockwave dynamics. Furthermore, we test the transferability of the trained model to real traffic by using two datasets: the Next Generation Simulation (NGSIM) program and the Highway Drone (HighD) dataset. Finally, we present an ensemble version of the CNN that allows us to handle multiple (and unknown) probe vehicle penetration rates. The results demonstrate that anisotropic kernels can reduce model complexity while improving the correctness of the estimation, and that simulation-based training is a viable alternative to model fitting using real-world data. This suggests that exploiting prior traffic knowledge adds value to learning-based estimation methods, and that there is great potential in exploring broader approaches to do so.