Abstract:Unsupervised feature selection (UFS) has recently gained attention for its effectiveness in processing unlabeled high-dimensional data. However, existing methods overlook the intrinsic causal mechanisms within the data, resulting in the selection of irrelevant features and poor interpretability. Additionally, previous graph-based methods fail to account for the differing impacts of non-causal and causal features in constructing the similarity graph, which leads to false links in the generated graph. To address these issues, a novel UFS method, called Causally-Aware UnSupErvised Feature Selection learning (CAUSE-FS), is proposed. CAUSE-FS introduces a novel causal regularizer that reweights samples to balance the confounding distribution of each treatment feature. This regularizer is subsequently integrated into a generalized unsupervised spectral regression model to mitigate spurious associations between features and clustering labels, thus achieving causal feature selection. Furthermore, CAUSE-FS employs causality-guided hierarchical clustering to partition features with varying causal contributions into multiple granularities. By integrating similarity graphs learned adaptively at different granularities, CAUSE-FS increases the importance of causal features when constructing the fused similarity graph to capture the reliable local structure of data. Extensive experimental results demonstrate the superiority of CAUSE-FS over state-of-the-art methods, with its interpretability further validated through feature visualization.
Abstract:Accurate prediction of metro Origin-Destination (OD) flow is essential for the development of intelligent transportation systems and effective urban traffic management. Existing approaches typically either predict passenger outflow of departure stations or inflow of destination stations. However, we argue that travelers generally have clearly defined departure and arrival stations, making these OD pairs inherently interconnected. Consequently, considering OD pairs as a unified entity more accurately reflects actual metro travel patterns and allows for analyzing potential spatio-temporal correlations between different OD pairs. To address these challenges, we propose a novel and effective urban metro OD flow prediction method (UMOD), comprising three core modules: a data embedding module, a temporal relation module, and a spatial relation module. The data embedding module projects raw OD pair inputs into hidden space representations, which are subsequently processed by the temporal and spatial relation modules to capture both inter-pair and intra-pair spatio-temporal dependencies. Experimental results on two real-world urban metro OD flow datasets demonstrate that adopting the OD pairs perspective is critical for accurate metro OD flow prediction. Our method outperforms existing approaches, delivering superior predictive performance.
Abstract:Learning temporal dependencies among targets (TDT) benefits better time series forecasting, where targets refer to the predicted sequence. Although autoregressive methods model TDT recursively, they suffer from inefficient inference and error accumulation. We argue that integrating TDT learning into non-autoregressive methods is essential for pursuing effective and efficient time series forecasting. In this study, we introduce the differencing approach to represent TDT and propose a parameter-free and plug-and-play solution through an optimization objective, namely TDT Loss. It leverages the proportion of inconsistent signs between predicted and ground truth TDT as an adaptive weight, dynamically balancing target prediction and fine-grained TDT fitting. Importantly, TDT Loss incurs negligible additional cost, with only $\mathcal{O}(n)$ increased computation and $\mathcal{O}(1)$ memory requirements, while significantly enhancing the predictive performance of non-autoregressive models. To assess the effectiveness of TDT loss, we conduct extensive experiments on 7 widely used datasets. The experimental results of plugging TDT loss into 6 state-of-the-art methods show that out of the 168 experiments, 75.00\% and 94.05\% exhibit improvements in terms of MSE and MAE with the maximum 24.56\% and 16.31\%, respectively.
Abstract:Spatio-temporal forecasting of future values of spatially correlated time series is important across many cyber-physical systems (CPS). Recent studies offer evidence that the use of graph neural networks to capture latent correlations between time series holds a potential for enhanced forecasting. However, most existing methods rely on pre-defined or self-learning graphs, which are either static or unintentionally dynamic, and thus cannot model the time-varying correlations that exhibit trends and periodicities caused by the regularity of the underlying processes in CPS. To tackle such limitation, we propose Time-aware Graph Structure Learning (TagSL), which extracts time-aware correlations among time series by measuring the interaction of node and time representations in high-dimensional spaces. Notably, we introduce time discrepancy learning that utilizes contrastive learning with distance-based regularization terms to constrain learned spatial correlations to a trend sequence. Additionally, we propose a periodic discriminant function to enable the capture of periodic changes from the state of nodes. Next, we present a Graph Convolution-based Gated Recurrent Unit (GCGRU) that jointly captures spatial and temporal dependencies while learning time-aware and node-specific patterns. Finally, we introduce a unified framework named Time-aware Graph Convolutional Recurrent Network (TGCRN), combining TagSL, and GCGRU in an encoder-decoder architecture for multi-step spatio-temporal forecasting. We report on experiments with TGCRN and popular existing approaches on five real-world datasets, thus providing evidence that TGCRN is capable of advancing the state-of-the-art. We also cover a detailed ablation study and visualization analysis, offering detailed insight into the effectiveness of time-aware structure learning.
Abstract:Urban metro flow prediction is of great value for metro operation scheduling, passenger flow management and personal travel planning. However, it faces two main challenges. First, different metro stations, e.g. transfer stations and non-transfer stations, have unique traffic patterns. Second, it is challenging to model complex spatio-temporal dynamic relation of metro stations. To address these challenges, we develop a spatio-temporal dynamic graph relational learning model (STDGRL) to predict urban metro station flow. First, we propose a spatio-temporal node embedding representation module to capture the traffic patterns of different stations. Second, we employ a dynamic graph relationship learning module to learn dynamic spatial relationships between metro stations without a predefined graph adjacency matrix. Finally, we provide a transformer-based long-term relationship prediction module for long-term metro flow prediction. Extensive experiments are conducted based on metro data in Beijing, Shanghai, Chongqing and Hangzhou. Experimental results show the advantages of our method beyond 11 baselines for urban metro flow prediction.
Abstract:Weather Forecasting is an attractive challengeable task due to its influence on human life and complexity in atmospheric motion. Supported by massive historical observed time series data, the task is suitable for data-driven approaches, especially deep neural networks. Recently, the Graph Neural Networks (GNNs) based methods have achieved excellent performance for spatio-temporal forecasting. However, the canonical GNNs-based methods only individually model the local graph of meteorological variables per station or the global graph of whole stations, lacking information interaction between meteorological variables in different stations. In this paper, we propose a novel Hierarchical Spatio-Temporal Graph Neural Network (HiSTGNN) to model cross-regional spatio-temporal correlations among meteorological variables in multiple stations. An adaptive graph learning layer and spatial graph convolution are employed to construct self-learning graph and study hidden dependency among nodes of variable-level and station-level graph. For capturing temporal pattern, the dilated inception as the backbone of gate temporal convolution is designed to model long and various meteorological trends. Moreover, a dynamic interaction learning is proposed to build bidirectional information passing in hierarchical graph. Experimental results on three real-world meteorological datasets demonstrate the superior performance of HiSTGNN beyond 7 baselines and it reduces the errors by 4.2% to 11.6% especially compared to state-of-the-art weather forecasting method.