Abstract: Spatial-temporal forecasting has various applications in transportation, climate, and human activity domains. Current spatial-temporal forecasting models primarily adopt a macro perspective, focusing on achieving strong overall prediction performance for the entire system. However, most of these models overlook the uniformity of prediction performance across different nodes, which leads to poor predictions for certain nodes and renders some results impractical. Improving uniformity is particularly challenging due to the inherent heterogeneity of spatial-temporal data. To address this issue, we propose a novel Heterogeneity-informed Mixture-of-Experts (HiMoE) for fair spatial-temporal forecasting. Specifically, we design a Heterogeneity-Informed Graph Convolutional Network (HiGCN), integrated into each expert model to enhance the flexibility of the experts. To adapt to the heterogeneity of spatial-temporal data, we design a Node-wise Mixture-of-Experts (NMoE). This model decouples the spatial-temporal prediction task into sub-tasks at the spatial scale, which are then assigned to different experts. To allocate these sub-tasks, we use a mean-based graph decoupling method to distinguish the graph structure for each expert. The results are then aggregated using an output gating mechanism based on a dense Mixture-of-Experts (dMoE). Additionally, we propose fairness-aware loss and evaluation functions to train the model with both uniformity and accuracy as objectives. Experiments conducted on four datasets, encompassing diverse data types and spatial scopes, validate HiMoE's ability to scale across various real-world scenarios. Furthermore, HiMoE consistently outperforms baseline models in both accuracy and uniformity.
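The two ideas that close this abstract, dense output gating over experts and a loss that rewards uniform node-wise accuracy, can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration: the function names, the tensor shapes, and in particular the loss form (mean node-wise MAE plus a weighted standard deviation across nodes) are hypothetical and are not taken from the HiMoE paper, which does not state its loss in the abstract.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def dense_moe_combine(expert_outputs, gate_logits):
    """Combine expert predictions with a dense (non-sparse) gate.

    expert_outputs: array of shape (E, N, T), one forecast per expert,
                    node, and horizon step (hypothetical layout).
    gate_logits:    array of shape (N, E), one score per node and expert.
    Returns an array of shape (N, T).
    """
    # Dense gating: every expert receives a nonzero weight for every node.
    weights = softmax(gate_logits, axis=-1)  # (N, E)
    return np.einsum("ne,ent->nt", weights, expert_outputs)


def fairness_aware_loss(pred, target, lam=1.0):
    """Illustrative loss: overall accuracy plus a uniformity penalty.

    node_mae.mean() measures accuracy over the whole system, while
    node_mae.std() penalizes uneven error across nodes. The weight
    `lam` is a hypothetical trade-off parameter.
    """
    node_mae = np.abs(pred - target).mean(axis=1)  # (N,)
    return node_mae.mean() + lam * node_mae.std()


# Two experts, two nodes, two horizon steps.
experts = np.stack([np.zeros((2, 2)), np.full((2, 2), 2.0)])
logits = np.zeros((2, 2))  # equal scores give uniform gate weights
combined = dense_moe_combine(experts, logits)
target = np.array([[1.0, 1.0], [3.0, 3.0]])
loss = fairness_aware_loss(combined, target, lam=1.0)
```

In this toy case the second node is predicted much worse than the first, so the standard-deviation term raises the loss above the plain mean error. That is the behavior a uniformity objective is meant to capture.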
Abstract: Multimodal traffic flow can reflect the health of the transportation system, and its prediction is crucial to urban traffic management. Recent works overemphasize the spatio-temporal correlations of traffic flow, ignoring the physical concepts that give rise to the observations and the causal relationships among them. Spatio-temporal correlations are unstable under varying conditions, and spurious correlations may exist in the observations. In this paper, we analyze the physical concepts affecting the generation of multimodal traffic flow from the perspective of the observation generation principle and propose a Causal Conditional Hidden Markov Model (CCHMM) to predict multimodal traffic flow. In the latent variable inference stage, a posterior network disentangles the causal representations of the concepts of interest from conditional information and observations, and a causal propagation module mines their causal relationships. In the data generation stage, a prior network samples the causal latent variables from the prior distribution and feeds them into the generator to generate multimodal traffic flow. We use a mutually supervised training method for the prior and posterior to enhance the identifiability of the model. Experiments on real-world datasets show that CCHMM can effectively disentangle causal representations of the concepts of interest, identify causality, and accurately predict multimodal traffic flow.
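For orientation, the inference and generation stages described above can be related to the standard factorization of a conditional hidden Markov model with a learned prior and posterior. The notation below is an assumption for illustration only ($x_t$ for observed flow, $z_t$ for latent variables, $c_t$ for conditional information, $\theta$ and $\phi$ for prior/generator and posterior parameters). It is the generic variational form, not the CCHMM objective itself, and it omits the causal propagation module and the mutually supervised training scheme.

```latex
% Generic conditional HMM: prior transition and generator (emission)
p_\theta(x_{1:T}, z_{1:T} \mid c_{1:T})
  = \prod_{t=1}^{T} p_\theta(z_t \mid z_{t-1}, c_t)\, p_\theta(x_t \mid z_t)

% Standard evidence lower bound with a posterior network q_\phi
\log p_\theta(x_{1:T} \mid c_{1:T})
  \ge \sum_{t=1}^{T} \mathbb{E}_{q_\phi}\!\Big[
        \log p_\theta(x_t \mid z_t)
        - \mathrm{KL}\big( q_\phi(z_t \mid z_{t-1}, x_t, c_t)
            \,\big\|\, p_\theta(z_t \mid z_{t-1}, c_t) \big)
      \Big]
```

In this generic form the KL term pulls the posterior and the prior toward each other, which is one common way a prior and a posterior network are trained jointly.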
Abstract: Bike-sharing is a representative form of public transportation, and bike flow prediction is the fundamental issue in managing bike-sharing systems. Recent methods overemphasize the spatio-temporal correlations in the data, ignoring the effects of contextual conditions on the transportation system and the time-varying causality between regions. In addition, because observations are incomplete, random contextual conditions lead to spurious correlations between data and features, making the model's predictions ineffective in special scenarios. To overcome this issue, we propose a Spatio-temporal Neural Structure Causal Model (STNSCM) from the perspective of causality. First, we build a causal graph to describe traffic prediction and analyze the causal relationships between the input data, contextual conditions, spatio-temporal states, and prediction results. Second, we apply the front-door criterion to eliminate confounding biases in the feature extraction process. Finally, we propose a counterfactual representation reasoning module that extrapolates the spatio-temporal state under the factual scenario to future counterfactual scenarios to improve prediction performance. Experiments on real-world datasets demonstrate the superior performance of our model, especially its resistance to fluctuations caused by the external environment. The source code and data will be released.
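As background for the front-door criterion mentioned above, the textbook front-door adjustment formula is reproduced here. The symbols are generic assumptions ($X$ for the treatment, $M$ for a mediator that satisfies the front-door conditions, $Y$ for the outcome). How STNSCM maps its input data, spatio-temporal states, and predictions onto these roles is not specified in the abstract.

```latex
% Front-door adjustment: the effect of X on Y is identified through
% the mediator M even when X and Y share an unobserved confounder.
P\big(y \mid \mathrm{do}(x)\big)
  = \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')
```

The inner sum averages over all treatment values $x'$, which is what removes the confounding path between $X$ and $Y$.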