Abstract: Recent studies reveal that Autonomous Vehicles (AVs) can be manipulated by hidden backdoors, causing them to perform harmful actions when activated by physical triggers. However, it is still unclear how these triggers can be activated while adhering to traffic principles. Understanding this vulnerability in a dynamic traffic environment is crucial. This work addresses this gap by formulating physical trigger activation as a reachability problem of a controlled dynamical system. Our technique identifies security-critical areas in traffic systems where trigger conditions for accidents can be reached, and provides intended trajectories showing how those conditions can be reached. Tests on typical traffic scenarios show that the system can be successfully driven to trigger conditions with a near-100% activation rate. Our method helps identify AV vulnerabilities and enables effective safety strategies.
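The reachability framing can be illustrated with a toy example. The sketch below is not the paper's method: it assumes a hypothetical discrete-time system whose state is `(gap, rel_speed)` between an attacker vehicle and the AV, a control set of accelerations `{-1, 0, 1}`, and bounds `v_max` and `gap_max` standing in for traffic constraints. A breadth-first search then returns a shortest control sequence that drives the state into a trigger set, or `None` if the set is unreachable.

```python
from collections import deque


def reach_trigger(start, target, controls=(-1, 0, 1), v_max=2, gap_max=20):
    """Breadth-first search over a toy discrete-time controlled system.

    State is (gap, rel_speed). Hypothetical dynamics for illustration:
    gap <- gap + rel_speed, then rel_speed <- rel_speed + a.
    Returns a shortest list of controls from start into target, or None.
    """
    parent = {start: None}  # state -> (previous state, control used)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in target:
            # Walk the parent links back to the start to recover the controls.
            plan = []
            while parent[state] is not None:
                state, a = parent[state]
                plan.append(a)
            return plan[::-1]
        gap, v = state
        for a in controls:
            nxt = (gap + v, v + a)
            # Assumed constraints: bounded relative speed, bounded road segment.
            within_limits = abs(nxt[1]) <= v_max and 0 <= nxt[0] <= gap_max
            if within_limits and nxt not in parent:
                parent[nxt] = (state, a)
                queue.append(nxt)
    return None
```

Calling `reach_trigger((10, 0), {(5, 0)})` returns a control sequence that closes the gap from 10 to 5 and ends at zero relative speed. The set of all start states for which the call succeeds plays the role of a security-critical area in this toy setting.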
Abstract: Backdoor attacks pose a new threat to Deep Neural Networks (DNNs): a backdoor is inserted into the neural network by poisoning the training dataset, so that inputs containing the adversary's trigger are misclassified. The major challenge in defending against these attacks is that only the attacker knows the secret trigger and the target class. The problem is further exacerbated by the recent introduction of "Hidden Triggers", where the triggers are carefully fused into the input, bypassing detection by human inspection and causing backdoor identification through anomaly detection to fail. To defend against such imperceptible attacks, in this work we systematically analyze how representations, i.e., the set of neuron activations for a given DNN when using the training data as inputs, are affected by backdoor attacks. We propose PiDAn, an algorithm based on coherence optimization that purifies the poisoned data. Our analysis shows that representations of poisoned data and authentic data in the target class are still embedded in different linear subspaces, which implies that they show different coherence with some latent spaces. Based on this observation, the proposed PiDAn algorithm learns a sample-wise weight vector to maximize the projected coherence of weighted samples, and we demonstrate that the learned weight vector has a natural "grouping effect" and is distinguishable between authentic data and poisoned data. This enables the systematic detection and mitigation of backdoor attacks. Based on our theoretical analysis and experimental results, we demonstrate the effectiveness of PiDAn in defending against backdoor attacks that use different settings of poisoned samples on the GTSRB and ILSVRC2012 datasets. Our PiDAn algorithm can detect more than 90% of infected classes and identify 95% of poisoned samples.
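The observation that poisoned and authentic representations lie in different linear subspaces can be illustrated with a much simpler stand-in than PiDAn itself. The sketch below does not implement PiDAn's weighted coherence optimization. It assumes authentic samples form the majority of a class, estimates their dominant direction by power iteration, and scores each sample by its squared cosine with that direction. The function names `dominant_direction` and `coherence` are hypothetical.

```python
import math


def dominant_direction(samples, n_iter=100):
    """Power iteration on the second-moment matrix sum_x x x^T."""
    d = len(samples[0])
    w = [1.0] * d
    for _ in range(n_iter):
        # Multiply w by the second-moment matrix: w <- sum_x (x . w) x
        proj = [sum(xi * wi for xi, wi in zip(x, w)) for x in samples]
        w = [sum(p * x[k] for p, x in zip(proj, samples)) for k in range(d)]
        norm = math.sqrt(sum(wk * wk for wk in w))
        w = [wk / norm for wk in w]
    return w


def coherence(x, w):
    """Squared cosine between sample x and unit direction w (in [0, 1])."""
    dot = sum(xi * wi for xi, wi in zip(x, w))
    return dot * dot / sum(xi * xi for xi in x)


# Toy representations: authentic samples near one axis, poisoned near another.
authentic = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.1, 0.0], [1.5, 0.0, 0.0]]
poisoned = [[0.0, 0.0, 1.0], [0.0, 0.1, 2.0]]
w = dominant_direction(authentic + poisoned)
scores = [coherence(x, w) for x in authentic + poisoned]
```

In this toy data the authentic samples score close to 1 and the poisoned ones close to 0, which is the kind of separation a sample-wise weight vector could exploit. The stand-in breaks down if poisoned samples dominate the class.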
Abstract: This paper develops a data-driven toolkit for traffic forecasting using high-resolution (a.k.a. event-based) traffic data. This is the raw data obtained from fixed sensors in urban roads. Time series of such raw data exhibit heavy fluctuations from one time step to the next (typically on the order of 0.1-1 second). Short-term forecasts (10-30 seconds into the future) of traffic conditions are critical for traffic operations applications (e.g., adaptive signal control). However, traffic forecasting tools in the literature deal predominantly with 3-5 minute aggregated data, whereas the typical signal cycle is on the order of 2 minutes. This renders such forecasts useless at the operations level. To address this, we model the traffic forecasting problem as a matrix completion problem, where the forecasting inputs are mapped to a higher dimensional space using kernels. The formulation allows us to capture both nonlinear dependencies between forecasting inputs and outputs and dependencies among the inputs. The latter correspond to correlations between different locations in the network. We further employ adaptive boosting to enhance the training accuracy and capture historical patterns in the data. The performance of the proposed methods is verified using high-resolution data obtained from a real-world traffic network in Abu Dhabi, UAE. Our experimental results show that the proposed method outperforms other state-of-the-art algorithms.
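One way to picture forecasting as matrix completion is to stack sliding windows of a sensor series into rows of the form `[lagged inputs | future outputs]`, leaving the not-yet-observed outputs empty. The sketch below is an assumed layout for illustration only: the function name `build_completion_matrix` and its parameters are hypothetical, it handles a single sensor, and it omits the kernel mapping described in the abstract.

```python
def build_completion_matrix(series, n_lags, horizon):
    """Stack sliding windows of a sensor series into rows [lags | future].

    Entries that lie beyond the end of the series are set to None; a
    matrix completion method would then be asked to fill them in.
    """
    rows = []
    T = len(series)
    for t in range(n_lags, T + 1):
        past = list(series[t - n_lags:t])
        future = [series[t + h] if t + h < T else None for h in range(horizon)]
        rows.append(past + future)
    return rows


# A short 0/1 occupancy series from one hypothetical detector.
matrix = build_completion_matrix([0, 1, 1, 0, 1], n_lags=2, horizon=1)
```

Here the last row of `matrix` is `[0, 1, None]`: the two most recent readings are known and the one-step-ahead value is the entry to be completed.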
Abstract: We focus on short-term traffic forecasting for traffic operations management. Specifically, we focus on forecasting traffic network sensor states at high resolution (second by second). Most work on traffic forecasting has focused on predicting aggregated traffic variables, typically over intervals that are no shorter than 5 minutes. The data resolution required for traffic operations is challenging, since high-resolution data exhibit heavier oscillations and precise patterns are harder to capture. We propose a (big) data-driven methodology for this purpose. Our contributions are threefold. First, we show how the forecasting problem can be modeled as a matrix completion problem. Second, we employ a block-coordinate descent algorithm and demonstrate that the algorithm converges in sub-linear time to a block coordinate-wise optimizer. This allows us to capitalize on the "bigness" of high-resolution data in a computationally feasible way. Third, we develop an adaptive boosting (or ensemble learning) approach to reduce the training error to within any arbitrary error threshold. The latter uses data from past days, so the boosting can be interpreted as capturing periodic patterns in the data. The performance of the proposed method is analyzed theoretically and tested empirically using a real-world high-resolution traffic dataset from Abu Dhabi, UAE. Our experimental results show that the proposed method outperforms other state-of-the-art algorithms.
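The idea of block-coordinate descent for matrix completion can be sketched on the simplest possible model. The code below is not the paper's algorithm: it assumes a rank-1 model `X[i][j] ~ u[i] * v[j]`, marks missing entries with `None`, and alternates exact least-squares updates of the two blocks `u` and `v`. Because each block update minimizes the observed squared error with the other block fixed, the recorded error never increases.

```python
def complete_rank1(X, n_iter=200):
    """Fit X[i][j] ~ u[i] * v[j] on observed entries (None marks missing)."""
    m, n = len(X), len(X[0])
    u, v = [1.0] * m, [1.0] * n
    history = []  # observed squared error after each full sweep
    for _ in range(n_iter):
        # Block 1: with v fixed, each u[i] has a closed-form least-squares update.
        for i in range(m):
            obs = [j for j in range(n) if X[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(X[i][j] * v[j] for j in obs) / den
        # Block 2: with u fixed, update each v[j] in the same way.
        for j in range(n):
            obs = [i for i in range(m) if X[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(X[i][j] * u[i] for i in obs) / den
        history.append(sum((X[i][j] - u[i] * v[j]) ** 2
                           for i in range(m) for j in range(n)
                           if X[i][j] is not None))
    return u, v, history


# Rank-1 toy matrix with one missing entry (its true value would be 6).
X = [[1.0, 2.0], [2.0, 4.0], [3.0, None]]
u, v, history = complete_rank1(X)
prediction = u[2] * v[1]
```

On this toy matrix, `prediction` approaches the value implied by the rank-1 structure. A practical version would use a higher rank and regularization, both omitted here.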
Abstract: For large-scale industrial processes under closed-loop control, process dynamics directly resulting from control action are typical characteristics and may behave differently under real faults and under normal changes of operating conditions. However, conventional distributed monitoring approaches do not consider the closed-loop control mechanism and only explore static characteristics. They are therefore incapable of distinguishing between real process faults and nominal changes of operating conditions, leading to unnecessary alarms. To address this, this paper proposes a distributed monitoring method for closed-loop industrial processes that concurrently explores static and dynamic characteristics. First, the large-scale closed-loop process is decomposed into several subsystems by developing a sparse slow feature analysis (SSFA) algorithm, which captures changes of both static and dynamic information. Second, distributed models are developed to separately capture static and dynamic characteristics from the local and global aspects. Based on the distributed monitoring system, a two-level monitoring strategy is proposed to check the different influences on process characteristics resulting from changes of the operating conditions and from control action, so that the two kinds of change can be well distinguished from each other. Case studies are conducted on both benchmark data and real industrial process data to illustrate the effectiveness of the proposed method.
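Slow feature analysis looks for features that vary slowly over time, which is what lets it separate dynamic behavior from static level. The sketch below shows only the quantity that standard SFA minimizes, the mean squared one-step difference of a standardized signal. It is not the SSFA algorithm of the paper, and the function name `slowness` is hypothetical.

```python
from statistics import fmean, pstdev


def slowness(signal):
    """Mean squared one-step change of the standardized signal.

    Smaller values mean a slower feature. Assumes the signal is not
    constant, since a zero standard deviation would divide by zero.
    """
    mu, sd = fmean(signal), pstdev(signal)
    z = [(x - mu) / sd for x in signal]
    return fmean([(b - a) ** 2 for a, b in zip(z, z[1:])])


slow = list(range(100))                # steady drift
fast = [(-1) ** t for t in range(100)]  # flips sign at every step
```

With these two signals, `slowness(slow)` is far smaller than `slowness(fast)`. In a monitoring setting, a shift in such a dynamic index can be tracked separately from a shift in the static mean.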