Abstract: This study develops FusionTransNet, a framework for Origin-Destination (OD) flow prediction in smart, multimodal urban transportation systems. The complexity of urban transportation arises from the spatiotemporal interactions among various traffic modes. Our analysis of multimodal data from Shenzhen motivates a framework that can dissect the complicated spatiotemporal interactions between these modes, from the microscopic local level to the macroscopic city-wide level. The framework contains three core components: the Intra-modal Learning Module, the Inter-modal Learning Module, and the Prediction Decoder. The Intra-modal Learning Module analyzes spatial dependencies within individual transportation modes, facilitating a granular understanding of single-mode spatiotemporal dynamics. The Inter-modal Learning Module extends this analysis by integrating data across modes to uncover cross-modal interdependencies, breaking down the interactions at both local and global scales. Finally, the Prediction Decoder synthesizes the outputs of the preceding modules to generate accurate OD flow predictions, translating complex multimodal interactions into forecasts. Empirical evaluations in metropolitan contexts, including Shenzhen and New York, demonstrate FusionTransNet's superior predictive accuracy compared to existing state-of-the-art methods. The implications of this study extend beyond urban transportation: the method for transferring information across different spatiotemporal graphs at both local and global scales can be instrumental in other spatial systems, such as supply chain logistics and epidemic spreading.
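To make the prediction target of the abstract above concrete, the following minimal sketch shows one common way to represent multimodal OD flows: a trip-count matrix per transportation mode and time interval. The function name `build_od_flows`, the trip tuple layout, and the 60-minute interval are assumptions for illustration only; the abstract does not describe FusionTransNet's actual data pipeline.

```python
from collections import defaultdict


def build_od_flows(trips, num_regions, interval_minutes=60):
    """Aggregate trip records into per-mode, per-interval OD matrices.

    trips: iterable of (mode, origin_region, dest_region, departure_minute).
    Returns a dict mapping (mode, interval_index) to a
    num_regions x num_regions matrix of trip counts.
    The record layout is a hypothetical one chosen for this sketch.
    """
    flows = defaultdict(lambda: [[0] * num_regions for _ in range(num_regions)])
    for mode, origin, dest, minute in trips:
        interval = minute // interval_minutes  # bucket departures by time slot
        flows[(mode, interval)][origin][dest] += 1
    return dict(flows)


# Toy data: three regions, two modes, departures given in minutes.
trips = [
    ("metro", 0, 2, 15),
    ("metro", 0, 2, 40),
    ("taxi", 1, 0, 30),
    ("metro", 2, 1, 75),
]
od = build_od_flows(trips, num_regions=3)
```

Each mode then yields its own sequence of OD matrices over time, which is the kind of per-mode input that an intra-modal module could consume before any cross-modal fusion.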
Abstract: Spatiotemporal (ST) learning has become a crucial technique for enabling smart cities and sustainable urban development. Current ST learning models capture heterogeneity via various spatial convolution and temporal evolution blocks. However, rapid urbanization causes the distributions of urban data and city structures to fluctuate over short periods, so existing methods suffer from generalization and data adaptation issues. Despite prior efforts, existing methods fail to deal with newly arrived observations, and those methods with generalization capacity are limited by repeated training. Motivated by complementary learning in neuroscience, we introduce a prompt-based complementary spatiotemporal learning framework, termed ComS2T, to empower models to evolve for data adaptation. ComS2T partitions the neural architecture into a stable neocortex for consolidating historical memory and a dynamic hippocampus for updating new knowledge. We first disentangle the architecture into two disjoint structures, stable and dynamic weights, and then train spatial and temporal prompts by characterizing the distribution of the main observations so that the prompts adapt to new data. This data-adaptive prompt mechanism, combined with a two-stage training process, facilitates fine-tuning of the neural architecture conditioned on prompts, thereby enabling efficient adaptation during testing. Extensive experiments validate the efficacy of ComS2T in adapting to various spatiotemporal out-of-distribution scenarios while maintaining efficient inference.
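As a rough illustration of the stable/dynamic weight split described in the abstract above, the sketch below applies a gradient step only to weights flagged as dynamic and leaves the stable ones untouched. The boolean `dynamic_mask`, the function `masked_sgd_step`, and the plain SGD update are all assumptions for illustration; the abstract does not specify how ComS2T selects the partition or how the prompts condition the update, and neither is modelled here.

```python
def masked_sgd_step(weights, grads, dynamic_mask, lr=0.1):
    """Update only the weights flagged as dynamic; stable weights are kept.

    weights, grads: flat lists of floats of equal length.
    dynamic_mask: list of bools, True where a weight may be adapted
    (a hypothetical partition chosen by hand for this sketch).
    """
    return [
        w - lr * g if is_dynamic else w
        for w, g, is_dynamic in zip(weights, grads, dynamic_mask)
    ]


weights = [1.0, -2.0, 0.5, 3.0]
grads = [0.5, 0.5, -1.0, 2.0]
dynamic_mask = [False, True, True, False]  # middle two weights are "dynamic"

updated = masked_sgd_step(weights, grads, dynamic_mask)
```

Restricting adaptation to a subset of parameters is one simple way to keep previously learned behaviour fixed while still reacting to new data.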
Abstract: Spatiotemporal learning, which aims to extract spatiotemporal correlations from collected spatiotemporal data, has been a research hotspot in recent years. Considering the inherent graph structure of spatiotemporal data, recent works focus on capturing spatial dependencies by using Graph Convolutional Networks (GCNs) to aggregate vertex features under the guidance of adjacency matrices. In this paper, through extensive and in-depth experiments, we comprehensively analyze existing spatiotemporal graph learning models and reveal that extracting adjacency matrices with carefully designed strategies, which is viewed as the key to enhancing performance in graph learning, is largely ineffective. Based on these experiments, we also discover that the aggregation itself is more important than the way in which vertices are aggregated. Building on these preliminary findings, we propose a novel, efficient Graph-Free Spatial (GFS) learning module based on layer normalization for capturing spatial correlations in spatiotemporal graph learning. The proposed GFS module can be easily plugged into existing models to replace all graph convolution components. A rigorous theoretical proof demonstrates that the time complexity of GFS is significantly lower than that of the graph convolution operation. Extensive experiments verify the superiority of GFS in terms of both efficiency and learning effectiveness when processing graph-structured data, especially extremely large-scale graph data.
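The contrast drawn in the abstract above, between adjacency-guided aggregation and a graph-free alternative, can be sketched as follows. The first function is a dense graph-convolution style aggregation; the second normalizes each feature across vertices without any adjacency matrix. The second is only an assumption about how a normalization layer could mix information between vertices: standard layer normalization takes statistics over the feature axis, and the abstract does not give the actual GFS formulation. Both function names are hypothetical.

```python
import math


def gcn_aggregate(adj, x):
    """Adjacency-guided aggregation: row i is the adj-weighted sum of all rows.

    adj: N x N matrix (here row-normalized), x: N x d vertex features.
    A dense product like this costs O(N^2 * d).
    """
    n, d = len(x), len(x[0])
    return [
        [sum(adj[i][j] * x[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]


def vertex_normalize(x, eps=1e-5):
    """Graph-free mixing: normalize each feature across vertices.

    Every vertex sees the global mean and variance, so information is
    shared without an adjacency matrix. Cost is O(N * d).
    """
    n, d = len(x), len(x[0])
    out = [[0.0] * d for _ in range(n)]
    for k in range(d):
        col = [x[i][k] for i in range(n)]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n
        scale = 1.0 / math.sqrt(var + eps)
        for i in range(n):
            out[i][k] = (col[i] - mean) * scale
    return out


x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
adj = [[0.5, 0.5, 0.0], [1 / 3, 1 / 3, 1 / 3], [0.0, 0.5, 0.5]]

h_graph = gcn_aggregate(adj, x)  # needs the adjacency matrix
h_free = vertex_normalize(x)     # needs only the features
```

The difference in cost comes from the dense N x N product in the first function, which the second avoids by computing only per-feature statistics.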
Abstract: Spatiotemporal forecasting is an important topic in data science due to its diverse and critical applications in smart cities. Existing works mostly predict consecutive future steps from completely and continuously obtained observations, where the nearest observations can be exploited as key knowledge for instantaneous status estimation. However, the practical issues of early activity planning and sensor failures give rise to a brand-new task, namely non-consecutive forecasting. In this paper, we define spatiotemporal learning systems with missing observations as Grey Spatiotemporal Systems (G2S) and propose a Factor-Decoupled learning framework for G2S (FDG2S), whose core idea is to hierarchically decouple multi-level factors and enable both flexible aggregations and disentangled uncertainty estimations. First, to compensate for missing observations, we devise a generic semantic-neighboring sequence sampling, which selects representative sequences to capture both periodic regularity and instantaneous variations. Second, we turn the prediction of non-consecutive statuses into inferring statuses under expected combinations of exogenous factors. In particular, we propose a factor-decoupled aggregation scheme that decouples factor-induced predictive intensity and region-wise proximity via two energy functions of a conditional random field. To infer region-wise proximity under flexible factor-wise combinations and enable dynamic neighborhood aggregations, we further disentangle the compounded influences of exogenous factors on region-wise proximity and learn to aggregate them. Given the inherent incompleteness and critical applications of G2S, we put forward a DisEntangled Uncertainty Quantification to identify two types of uncertainty for reliability guarantees and model interpretation.