Abstract:Large Language Models (LLMs) have recently demonstrated significant potential in the field of time series forecasting, offering impressive capabilities in handling complex temporal data. However, their robustness and reliability in real-world applications remain under-explored, particularly concerning their susceptibility to adversarial attacks. In this paper, we introduce a targeted adversarial attack framework for LLM-based time series forecasting. By employing both gradient-free and black-box optimization methods, we generate minimal yet highly effective perturbations that significantly degrade the forecasting accuracy across multiple datasets and LLM architectures. Our experiments, which include models like TimeGPT and LLM-Time with GPT-3.5, GPT-4, LLaMa, and Mistral, show that adversarial attacks lead to much more severe performance degradation than random noise, and demonstrate the broad effectiveness of our attacks across different LLMs. The results underscore the critical vulnerabilities of LLMs in time series forecasting, highlighting the need for robust defense mechanisms to ensure their reliable deployment in practical applications.
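As a rough illustration of what a gradient-free, black-box perturbation search can look like, the sketch below runs a keep-the-best random search against a stand-in forecaster. It is a generic example, not the attack proposed in the paper: `toy_forecaster`, `random_search_attack`, the budget `eps`, and the assumption that the attacker can score candidates against a reference value are all hypothetical choices made for this example.

```python
import random


def toy_forecaster(history, window=4):
    """Stand-in black-box model (assumption): mean of the last `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def random_search_attack(history, reference, forecaster, eps=0.1, iters=200, seed=0):
    """Keep-the-best random search for a bounded perturbation.

    Only forecaster outputs are queried; no gradients are used.
    Every perturbation entry stays within [-eps, eps].
    """
    rng = random.Random(seed)
    best_delta = [0.0] * len(history)
    best_err = abs(forecaster(history) - reference)
    for _ in range(iters):
        delta = [rng.uniform(-eps, eps) for _ in history]
        perturbed = [x + d for x, d in zip(history, delta)]
        err = abs(forecaster(perturbed) - reference)
        if err > best_err:  # accept only candidates that increase the error
            best_delta, best_err = delta, err
    return best_delta, best_err


history = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 1.2, 1.1]
reference = 1.15  # illustrative value the clean forecast is scored against
delta, err = random_search_attack(history, reference, toy_forecaster)
print("clean error:   ", abs(toy_forecaster(history) - reference))
print("attacked error:", err)
```

Because the search starts from the zero perturbation and only accepts improvements, the attacked error can never fall below the clean error, and the perturbation never exceeds the budget.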
Abstract:Real-world datasets often contain missing or corrupted values. Completing multidimensional tensor-structured data with missing entries is essential for numerous applications. Smoothness-constrained low-rank factorization models have shown superior performance with reduced computational costs. While effective at capturing global and long-range correlations, these models struggle to reproduce short-scale, high-frequency variations in the data. In this paper, we introduce the Generalized Least Squares Kernelized Tensor Factorization (GLSKF) framework for tensor completion. GLSKF integrates smoothness-constrained low-rank factorization with a locally correlated residual process; the resulting additive structure can effectively characterize both global dependencies and local variations. In particular, we define the covariance norm to enforce the smoothness of factor matrices in the global low-rank factorization, and use structured covariance/kernel functions to model the local processes. For model estimation, we develop an alternating least squares (ALS) procedure with closed-form solutions for each subproblem. To efficiently handle missing data, GLSKF utilizes projection matrices that preserve the Kronecker structure of covariances, facilitating fast computations through conjugate gradient (CG) and preconditioned conjugate gradient (PCG) algorithms. The proposed framework is evaluated on four real-world datasets across diverse tasks: traffic speed imputation, color image inpainting, video completion, and MRI image reconstruction. Experimental results confirm that GLSKF delivers superior effectiveness and scalability, establishing it as a robust solution for multidimensional tensor completion.
Abstract:Accurate travel time estimation is essential for navigation and itinerary planning. While existing research employs probabilistic modeling to assess travel time uncertainty and account for correlations between multiple trips, modeling the temporal variability of multi-trip travel time distributions remains a significant challenge. Capturing the evolution of joint distributions requires large, well-organized datasets; however, real-world trip data are often temporally sparse and spatially unevenly distributed. To address this issue, we propose SPTTE, a spatiotemporal probabilistic framework that models the evolving joint distribution of multi-trip travel times by formulating the estimation task as a spatiotemporal stochastic process regression problem with fragmented observations. SPTTE incorporates an RNN-based temporal Gaussian process parameterization to regularize sparse observations and capture temporal dependencies. Additionally, it employs a prior-based heterogeneity smoothing strategy to correct unreliable learning caused by unevenly distributed trips, effectively modeling temporal variability under sparse and uneven data distributions. Evaluations on real-world datasets demonstrate that SPTTE outperforms state-of-the-art deterministic and probabilistic methods by over 10.13%. Ablation studies and visualizations further confirm the effectiveness of the model components.
Abstract:In probabilistic time series forecasting, the multivariate Gaussian (MVG) distribution is widely used as the predictive distribution for correlated continuous random variables. Current deep probabilistic models typically employ neural networks to parameterize the mean vector and covariance matrix of the distribution, with the log-score (i.e., negative log-likelihood) as the default loss function. However, the log-score is highly sensitive to outliers, leading to significant errors when anomalies are present in the data. Motivated by the use of the continuous ranked probability score (CRPS) in learning univariate distributions, we propose a robust loss function specifically designed for high-dimensional MVG outputs. The proposed MVG-CRPS loss function has a closed-form expression based on the neural network outputs, making it easy to integrate into deep learning models. We evaluate MVG-CRPS on two probabilistic forecasting tasks -- multivariate autoregressive and univariate sequence-to-sequence (Seq2Seq) forecasting -- both involving observations following an MVG distribution. Experimental results on real-world datasets demonstrate that MVG-CRPS achieves both robustness and efficiency, offering enhanced accuracy and uncertainty quantification in probabilistic forecasting.
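To make the contrast between the log-score and CRPS concrete, the sketch below evaluates both for a univariate Gaussian, using the standard closed-form Gaussian CRPS. This is only the univariate case that the abstract cites as motivation; it is not the MVG-CRPS loss itself, and the function names are chosen for this example.

```python
import math


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) evaluated at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def log_score_gaussian(mu, sigma, y):
    """Negative log-likelihood of y under N(mu, sigma^2)."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


# Compare both scores as the observation moves away from the predicted mean.
for y in (0.0, 2.0, 10.0):
    print(y, crps_gaussian(0.0, 1.0, y), log_score_gaussian(0.0, 1.0, y))
```

For a far outlier the log-score grows quadratically in the standardized error, while the CRPS grows roughly linearly, which is the sensitivity gap the abstract refers to.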
Abstract:Deep Generative Models (DGMs) have rapidly advanced in recent years, becoming essential tools in various fields due to their ability to learn complex data distributions and generate synthetic data. Their importance in transportation research is increasingly recognized, particularly for applications like traffic data generation, prediction, and feature extraction. This paper offers a comprehensive introduction and tutorial on DGMs, with a focus on their applications in transportation. It begins with an overview of generative models, followed by detailed explanations of fundamental models, a systematic review of the literature, and practical tutorial code to aid implementation. The paper also discusses current challenges and opportunities, highlighting how these models can be effectively utilized and further developed in transportation research. This paper serves as a valuable reference, guiding researchers and practitioners from foundational knowledge to advanced applications of DGMs in transportation research.
Abstract:Cooperative Adaptive Cruise Control (CACC) plays a pivotal role in enhancing traffic efficiency and safety in Connected and Autonomous Vehicles (CAVs). Reinforcement Learning (RL) has proven effective in optimizing complex decision-making processes in CACC, leading to improved system performance and adaptability. Among RL approaches, Multi-Agent Reinforcement Learning (MARL) has shown remarkable potential by enabling coordinated actions among multiple CAVs through Centralized Training with Decentralized Execution (CTDE). However, MARL often faces scalability issues, particularly when CACC vehicles suddenly join or leave the platoon, resulting in performance degradation. To address these challenges, we propose Communication-Aware Reinforcement Learning (CA-RL). CA-RL includes a communication-aware module that extracts and compresses vehicle communication information through forward and backward information transmission modules. This enables efficient cyclic information propagation within the CACC traffic flow, ensuring policy consistency and mitigating the scalability problems of MARL in CACC. Experimental results demonstrate that CA-RL significantly outperforms baseline methods in various traffic scenarios, achieving superior scalability, robustness, and overall system performance while maintaining reliable performance despite changes in the number of participating vehicles.
Abstract:Travel time estimation is a crucial application in navigation apps and web mapping services. Current deterministic and probabilistic methods primarily focus on modeling individual trips, assuming independence among trips. However, in real-world scenarios, we often observe strong inter-trip correlations due to factors such as weather conditions, traffic management, and road works. In this paper, we propose to model trip-level link travel time using a Gaussian hierarchical model, which can characterize both inter-trip and intra-trip correlations. The joint distribution of travel time of multiple trips becomes a multivariate Gaussian parameterized by learnable link representations. To effectively use the sparse GPS trajectories, we also propose a data augmentation method based on trip sub-sampling, which allows for fine-grained gradient backpropagation in learning link representations. During inference, we estimate the probability distribution of the travel time of a queried trip conditional on the completed trips that are spatiotemporally adjacent. We refer to the overall framework as ProbTTE. We evaluate ProbTTE on two real-world GPS trajectory datasets, and the results demonstrate its superior performance compared to state-of-the-art deterministic and probabilistic baselines. Additionally, we find that the learned link representations align well with the physical geometry of the network, making them suitable as input for other applications.
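The inference step described above, estimating a queried trip's travel time conditional on completed trips, rests on standard Gaussian conditioning. The sketch below shows the two-trip case using the textbook bivariate formulas. It is an illustration under assumed numbers, not the ProbTTE implementation; the function name and all means, variances, and covariances are made up for this example.

```python
import math


def condition_bivariate_gaussian(mu_q, mu_c, var_q, var_c, cov_qc, observed_c):
    """Conditional mean and variance of a queried trip given one completed trip.

    (queried, completed) travel times are assumed jointly Gaussian with
    means (mu_q, mu_c), variances (var_q, var_c), and covariance cov_qc.
    """
    gain = cov_qc / var_c                          # regression coefficient
    cond_mean = mu_q + gain * (observed_c - mu_c)  # shift toward the observed deviation
    cond_var = var_q - gain * cov_qc               # variance shrinks with correlation
    return cond_mean, cond_var


# Illustrative values in seconds: correlation 0.6, completed trip ran 80 s slow.
mean, var = condition_bivariate_gaussian(
    mu_q=600.0, mu_c=540.0,
    var_q=90.0 ** 2, var_c=80.0 ** 2,
    cov_qc=4320.0,  # 0.6 * 90 * 80
    observed_c=620.0,
)
print("conditional mean:", mean)
print("conditional std: ", math.sqrt(var))
```

With these inputs the conditional mean moves from 600 s to about 654 s and the standard deviation drops from 90 s to about 72 s, showing how a correlated completed trip both shifts and sharpens the estimate.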
Abstract:Learning and understanding car-following (CF) behaviors are crucial for microscopic traffic simulation. Traditional CF models, though simple, often lack generalization capabilities, while many data-driven methods, despite their robustness, operate as "black boxes" with limited interpretability. To bridge this gap, this work introduces a Bayesian Matrix Normal Mixture Regression (MNMR) model that simultaneously captures feature correlations and temporal dynamics inherent in CF behaviors. This approach is distinguished by its separate learning of row and column covariance matrices within the model framework, offering insight into human drivers' decision-making processes. Through extensive experiments, we assess the model's performance across various historical steps of inputs, predictive steps of outputs, and model complexities. The results consistently demonstrate that our model effectively captures the intricate correlations and temporal dynamics present during CF. A focused case study further illustrates the model's superior interpretability in identifying distinct operational conditions through the learned mean and covariance matrices. This not only underlines our model's effectiveness in understanding complex human driving behaviors in CF scenarios but also highlights its potential as a tool for enhancing the interpretability of CF behaviors in traffic simulations and autonomous driving systems.
Abstract:Modeling the correlations among errors is closely associated with how accurately the model can quantify predictive uncertainty in probabilistic time series forecasting. Recent multivariate models have made significant progress in accounting for contemporaneous correlations among errors, while a common assumption on these errors is that they are temporally independent for the sake of statistical simplicity. However, real-world observations often deviate from this assumption, since errors usually exhibit substantial autocorrelation due to various factors such as the exclusion of temporally correlated covariates. In this work, we propose an efficient method, based on a low-rank-plus-diagonal parameterization of the covariance matrix, which can effectively characterize the autocorrelation of errors. The proposed method possesses several desirable properties: the complexity does not scale with the number of time series, the resulting covariance can be used for calibrating predictions, and it can seamlessly integrate with any model with Gaussian-distributed errors. We empirically demonstrate these properties using two distinct neural forecasting models, GPVar and Transformer. Our experimental results confirm the effectiveness of our method in enhancing predictive accuracy and the quality of uncertainty quantification on multiple real-world datasets.
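One reason a low-rank-plus-diagonal covariance is computationally attractive is that the Gaussian likelihood can be evaluated without forming or inverting the full matrix. The sketch below shows the simplest case, a rank-1 term plus a diagonal, using the matrix determinant lemma and the Sherman-Morrison formula. It illustrates the general idea only and is not the parameterization from the paper; the function name and the example values are assumptions.

```python
import math


def gaussian_nll_rank1_plus_diag(r, d, v):
    """NLL of residual vector r under N(0, diag(d) + v v^T).

    The covariance is never built explicitly:
      log det = sum(log d_i) + log(1 + v^T D^{-1} v)        (determinant lemma)
      r^T S^{-1} r = r^T D^{-1} r
                     - (v^T D^{-1} r)^2 / (1 + v^T D^{-1} v) (Sherman-Morrison)
    so the cost is linear in len(r).
    """
    n = len(r)
    v_dinv_v = sum(vi * vi / di for vi, di in zip(v, d))
    v_dinv_r = sum(vi * ri / di for vi, ri, di in zip(v, r, d))
    r_dinv_r = sum(ri * ri / di for ri, di in zip(r, d))
    log_det = sum(math.log(di) for di in d) + math.log1p(v_dinv_v)
    quad = r_dinv_r - v_dinv_r * v_dinv_r / (1.0 + v_dinv_v)
    return 0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


# Illustrative residuals over a short horizon (values are made up).
residual = [0.3, -0.7, 0.1, 0.4]
diag = [1.0, 2.0, 1.5, 0.8]      # positive diagonal entries
factor = [0.5, -1.0, 0.2, 0.3]   # rank-1 factor
print(gaussian_nll_rank1_plus_diag(residual, diag, factor))
```

A rank-R factor follows the same pattern through the Woodbury identity, where the only dense solve involves an R-by-R matrix.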
Abstract:Long-term urban mobility predictions play a crucial role in the effective management of urban facilities and services. Conventionally, urban mobility data has been structured as spatiotemporal videos, treating longitude and latitude grids as fundamental pixels. Consequently, video prediction methods, relying on Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have been instrumental in this domain. In our research, we introduce a fresh perspective on urban mobility prediction. Instead of oversimplifying urban mobility data as traditional video data, we regard it as a complex multivariate time series. This perspective involves treating the time-varying values of each grid in each channel as individual time series, necessitating a thorough examination of temporal dynamics, cross-variable correlations, and frequency-domain insights for precise and reliable predictions. To address this challenge, we present the Super-Multivariate Urban Mobility Transformer (SUMformer), which utilizes a specially designed attention mechanism to calculate temporal and cross-variable correlations and reduce computational costs stemming from a large number of time series. SUMformer also employs low-frequency filters to extract essential information for long-term predictions. Furthermore, SUMformer is structured with a temporal patch merge mechanism, forming a hierarchical framework that enables the capture of multi-scale correlations. Consequently, it excels in urban mobility pattern modeling and long-term prediction, outperforming current state-of-the-art methods across three real-world datasets.