Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ling Zhao

Causal invariant geographic network representations with feature and structural distribution shifts

Mar 25, 2025

Yuhan Wang, Silu He, Qinyao Luo, Hongyuan Yuan, Ling Zhao, Jiawei Zhu, Haifeng Li

Abstract:The existing methods learn geographic network representations through deep graph neural networks (GNNs) based on the i.i.d. assumption. However, the spatial heterogeneity and temporal dynamics of geographic data make the out-of-distribution (OOD) generalisation problem particularly salient. The latter are particularly sensitive to distribution shifts (feature and structural shifts) between testing and training data and are the main causes of the OOD generalisation problem. Spurious correlations are present between invariant and background representations due to selection biases and environmental effects, resulting in the model extremes being more likely to learn background representations. The existing approaches focus on background representation changes that are determined by shifts in the feature distributions of nodes in the training and test data while ignoring changes in the proportional distributions of heterogeneous and homogeneous neighbour nodes, which we refer to as structural distribution shifts. We propose a feature-structure mixed invariant representation learning (FSM-IRL) model that accounts for both feature distribution shifts and structural distribution shifts. To address structural distribution shifts, we introduce a sampling method based on causal attention, encouraging the model to identify nodes possessing strong causal relationships with labels or nodes that are more similar to the target node. Inspired by the Hilbert-Schmidt independence criterion, we implement a reweighting strategy to maximise the orthogonality of the node representations, thereby mitigating the spurious correlations among the node representations and suppressing the learning of background representations. Our experiments demonstrate that FSM-IRL exhibits strong learning capabilities on both geographic and social network datasets in OOD scenarios.

* Future Generation Computer Systems 2025
* 15 pages, 3 figures, 8 tables

Via

Access Paper or Ask Questions

Dexterous Hand Manipulation via Efficient Imitation-Bootstrapped Online Reinforcement Learning

Mar 06, 2025

Dongchi Huang, Tianle Zhang, Yihang Li, Ling Zhao, Jiayi Li, Zhirui Fang, Chunhe Xia, Lusong Li, Xiaodong He

Abstract:Dexterous hand manipulation in real-world scenarios presents considerable challenges due to its demands for both dexterity and precision. While imitation learning approaches have thoroughly examined these challenges, they still require a significant number of expert demonstrations and are limited by a constrained performance upper bound. In this paper, we propose a novel and efficient Imitation-Bootstrapped Online Reinforcement Learning (IBORL) method tailored for robotic dexterous hand manipulation in real-world environments. Specifically, we pretrain the policy using a limited set of expert demonstrations and subsequently finetune this policy through direct reinforcement learning in the real world. To address the catastrophic forgetting issues that arise from the distribution shift between expert demonstrations and real-world environments, we design a regularization term that balances the exploration of novel behaviors with the preservation of the pretrained policy. Our experiments with real-world tasks demonstrate that our method significantly outperforms existing approaches, achieving an almost 100% success rate and a 23% improvement in cycle time. Furthermore, by finetuning with online reinforcement learning, our method surpasses expert demonstrations and uncovers superior policies. Our code and empirical results are available in https://hggforget.github.io/iborl.github.io/.

Via

Access Paper or Ask Questions

SeFi-CD: A Semantic First Change Detection Paradigm That Can Detect Any Change You Want

Jul 13, 2024

Ling Zhao, Zhenyang Huang, Dongsheng Kuang, Chengli Peng, Jun Gan, Haifeng Li

Abstract:The existing change detection(CD) methods can be summarized as the visual-first change detection (ViFi-CD) paradigm, which first extracts change features from visual differences and then assigns them specific semantic information. However, CD is essentially dependent on change regions of interest (CRoIs), meaning that the CD results are directly determined by the semantics changes of interest, making its primary image factor semantic of interest rather than visual. The ViFi-CD paradigm can only assign specific semantics of interest to specific change features extracted from visual differences, leading to the inevitable omission of potential CRoIs and the inability to adapt to different CRoI CD tasks. In other words, changes in other CRoIs cannot be detected by the ViFi-CD method without retraining the model or significantly modifying the method. This paper introduces a new CD paradigm, the semantic-first CD (SeFi-CD) paradigm. The core idea of SeFi-CD is to first perceive the dynamic semantics of interest and then visually search for change features related to the semantics. Based on the SeFi-CD paradigm, we designed Anything You Want Change Detection (AUWCD). Experiments on public datasets demonstrate that the AUWCD outperforms the current state-of-the-art CD methods, achieving an average F1 score 5.01\% higher than that of these advanced supervised baselines on the SECOND dataset, with a maximum increase of 13.17\%. The proposed SeFi-CD offers a novel CD perspective and approach.

Via

Access Paper or Ask Questions

Improving Rehabilitative Assessment with Statistical and Shape Preserving Surrogate Data and Singular Spectrum Analysis

Jun 22, 2024

T. K. M. Lee, H. W. Chan, K. H. Leo, E. Chew, Ling Zhao, S. Sanei

Figure 1 for Improving Rehabilitative Assessment with Statistical and Shape Preserving Surrogate Data and Singular Spectrum Analysis

Figure 2 for Improving Rehabilitative Assessment with Statistical and Shape Preserving Surrogate Data and Singular Spectrum Analysis

Figure 3 for Improving Rehabilitative Assessment with Statistical and Shape Preserving Surrogate Data and Singular Spectrum Analysis

Figure 4 for Improving Rehabilitative Assessment with Statistical and Shape Preserving Surrogate Data and Singular Spectrum Analysis

Abstract:Time series data are collected in temporal order and are widely used to train systems for prediction, modeling and classification to name a few. These systems require large amounts of data to improve generalization and prevent over-fitting. However there is a comparative lack of time series data due to operational constraints. This situation is alleviated by synthesizing data which have a suitable spread of features yet retain the distinctive features of the original data. These would be its basic statistical properties and overall shape which are important for short time series such as in rehabilitative applications or in quickly changing portions of lengthy data. In our earlier work synthesized surrogate time series were used to augment rehabilitative data. This gave good results in classification but the resulting waveforms did not preserve the original signal shape. To remedy this, we use singular spectrum analysis (SSA) to separate a signal into trends and cycles to describe the shape of the signal and low level components. In a novel way we subject the low level component to randomizing processes then recombine this with the original trend and cycle components to form a synthetic time series. We compare our approach with other methods, using statistical and shape measures and demonstrate its effectiveness in classification.

* 2022 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), Poznan, Poland, 2022, pp. 58-63
* This version of the paper under the same title, acknowledges the data source and the funding for current research using this data. arXiv admin note: substantial text overlap with arXiv:2404.14211

Via

Access Paper or Ask Questions

RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding

Jun 18, 2024

Linrui Xu, Ling Zhao, Wang Guo, Qiujun Li, Kewang Long, Kaiqi Zou, Yuhan Wang, Haifeng Li

Abstract:The remote sensing image intelligence understanding model is undergoing a new profound paradigm shift which has been promoted by multi-modal large language model (MLLM), i.e. from the paradigm learning a domain model (LaDM) shifts to paradigm learning a pre-trained general foundation model followed by an adaptive domain model (LaGD). Under the new LaGD paradigm, the old datasets, which have led to advances in RSI intelligence understanding in the last decade, are no longer suitable for fire-new tasks. We argued that a new dataset must be designed to lighten tasks with the following features: 1) Generalization: training model to learn shared knowledge among tasks and to adapt to different tasks; 2) Understanding complex scenes: training model to understand the fine-grained attribute of the objects of interest, and to be able to describe the scene with natural language; 3) Reasoning: training model to be able to realize high-level visual reasoning. In this paper, we designed a high-quality, diversified, and unified multimodal instruction-following dataset for RSI understanding produced by GPT-4V and existing datasets, which we called RS-GPT4V. To achieve generalization, we used a (Question, Answer) which was deduced from GPT-4V via instruction-following to unify the tasks such as captioning and localization; To achieve complex scene, we proposed a hierarchical instruction description with local strategy in which the fine-grained attributes of the objects and their spatial relationships are described and global strategy in which all the local information are integrated to yield detailed instruction descript; To achieve reasoning, we designed multiple-turn QA pair to provide the reasoning ability for a model. The empirical results show that the fine-tuned MLLMs by RS-GPT4V can describe fine-grained information. The dataset is available at: https://github.com/GeoX-Lab/RS-GPT4V.

* 14 pages, 6 figures, 4 tables

Via

Access Paper or Ask Questions

CAT: A Causally Graph Attention Network for Trimming Heterophilic Graph

Dec 15, 2023

Silu He, Qinyao Luo, Xinsha Fu, Ling Zhao, Ronghua Du, Haifeng Li

Abstract:Local Attention-guided Message Passing Mechanism (LAMP) adopted in Graph Attention Networks (GATs) is designed to adaptively learn the importance of neighboring nodes for better local aggregation on the graph, which can bring the representations of similar neighbors closer effectively, thus showing stronger discrimination ability. However, existing GATs suffer from a significant discrimination ability decline in heterophilic graphs because the high proportion of dissimilar neighbors can weaken the self-attention of the central node, jointly resulting in the deviation of the central node from similar nodes in the representation space. This kind of effect generated by neighboring nodes is called the Distraction Effect (DE) in this paper. To estimate and weaken the DE of neighboring nodes, we propose a Causally graph Attention network for Trimming heterophilic graph (CAT). To estimate the DE, since the DE are generated through two paths (grab the attention assigned to neighbors and reduce the self-attention of the central node), we use Total Effect to model DE, which is a kind of causal estimand and can be estimated from intervened data; To weaken the DE, we identify the neighbors with the highest DE (we call them Distraction Neighbors) and remove them. We adopt three representative GATs as the base model within the proposed CAT framework and conduct experiments on seven heterophilic datasets in three different sizes. Comparative experiments show that CAT can improve the node classification accuracy of all base GAT models. Ablation experiments and visualization further validate the enhancement of discrimination ability brought by CAT. The source code is available at https://github.com/GeoX-Lab/CAT.

* Information Science 2023 sumitted
* 24 pages, 17 figures, 4 tables

Via

Access Paper or Ask Questions

Deep Double Self-Expressive Subspace Clustering

Jun 20, 2023

Ling Zhao, Yunpeng Ma, Shanxiong Chen, Jun Zhou

Abstract:Deep subspace clustering based on auto-encoder has received wide attention. However, most subspace clustering based on auto-encoder does not utilize the structural information in the self-expressive coefficient matrix, which limits the clustering performance. In this paper, we propose a double self-expressive subspace clustering algorithm. The key idea of our solution is to view the self-expressive coefficient as a feature representation of the example to get another coefficient matrix. Then, we use the two coefficient matrices to construct the affinity matrix for spectral clustering. We find that it can reduce the subspace-preserving representation error and improve connectivity. To further enhance the clustering performance, we proposed a self-supervised module based on contrastive learning, which can further improve the performance of the trained network. Experiments on several benchmark datasets demonstrate that the proposed algorithm can achieve better clustering than state-of-the-art methods.

* 5 pagees,4 figures,ICASSP2023, revised version

Via

Access Paper or Ask Questions

STGC-GNNs: A GNN-based traffic prediction framework with a spatial-temporal Granger causality graph

Oct 30, 2022

Silu He, Qinyao Luo, Ronghua Du, Ling Zhao, Haifeng Li

Figure 1 for STGC-GNNs: A GNN-based traffic prediction framework with a spatial-temporal Granger causality graph

Figure 2 for STGC-GNNs: A GNN-based traffic prediction framework with a spatial-temporal Granger causality graph

Figure 3 for STGC-GNNs: A GNN-based traffic prediction framework with a spatial-temporal Granger causality graph

Figure 4 for STGC-GNNs: A GNN-based traffic prediction framework with a spatial-temporal Granger causality graph

Abstract:The key to traffic prediction is to accurately depict the temporal dynamics of traffic flow traveling in a road network, so it is important to model the spatial dependence of the road network. The essence of spatial dependence is to accurately describe how traffic information transmission is affected by other nodes in the road network, and the GNN-based traffic prediction model, as a benchmark for traffic prediction, has become the most common method for the ability to model spatial dependence by transmitting traffic information with the message passing mechanism. However, existing methods model a local and static spatial dependence, which cannot transmit the global-dynamic traffic information (GDTi) required for long-term prediction. The challenge is the difficulty of detecting the precise transmission of GDTi due to the uncertainty of individual transport, especially for long-term transmission. In this paper, we propose a new hypothesis\: GDTi behaves macroscopically as a transmitting causal relationship (TCR) underlying traffic flow, which remains stable under dynamic changing traffic flow. We further propose spatial-temporal Granger causality (STGC) to express TCR, which models global and dynamic spatial dependence. To model global transmission, we model the causal order and causal lag of TCRs global transmission by a spatial-temporal alignment algorithm. To capture dynamic spatial dependence, we approximate the stable TCR underlying dynamic traffic flow by a Granger causality test. The experimental results on three backbone models show that using STGC to model the spatial dependence has better results than the original model for 45 min and 1 h long-term prediction.

* 14 pages, 16 figures, 4 tables

Via

Access Paper or Ask Questions

KST-GCN: A Knowledge-Driven Spatial-Temporal Graph Convolutional Network for Traffic Forecasting

Nov 26, 2020

Jiawei Zhu, Xin Han, Hanhan Deng, Chao Tao, Ling Zhao, Lin Tao, Haifeng Li

Figure 1 for KST-GCN: A Knowledge-Driven Spatial-Temporal Graph Convolutional Network for Traffic Forecasting

Figure 2 for KST-GCN: A Knowledge-Driven Spatial-Temporal Graph Convolutional Network for Traffic Forecasting

Figure 3 for KST-GCN: A Knowledge-Driven Spatial-Temporal Graph Convolutional Network for Traffic Forecasting

Figure 4 for KST-GCN: A Knowledge-Driven Spatial-Temporal Graph Convolutional Network for Traffic Forecasting

Abstract:When considering the spatial and temporal features of traffic, capturing the impacts of various external factors on travel is an important step towards achieving accurate traffic forecasting. The impacts of external factors on the traffic flow have complex correlations. However, existing studies seldom consider external factors or neglecting the effect of the complex correlations among external factors on traffic. Intuitively, knowledge graphs can naturally describe these correlations, but knowledge graphs and traffic networks are essentially heterogeneous networks; thus, it is a challenging problem to integrate the information in both networks. We propose a knowledge representation-driven traffic forecasting method based on spatiotemporal graph convolutional networks. We first construct a city knowledge graph for traffic forecasting, then use KS-Cells to combine the information from the knowledge graph and the traffic network, and finally, capture the temporal changes of the traffic state with GRU. Testing on real-world datasets shows that the KST-GCN has higher accuracy than the baseline traffic forecasting methods at various prediction horizons. We provide a new way to integrate knowledge and the spatiotemporal features of data for traffic forecasting tasks. Without any loss of generality, the proposed method can also be extended to other spatiotemporal forecasting tasks.

* 11 pages, 17 figures, 3 tables

Via

Access Paper or Ask Questions

AST-GCN: Attribute-Augmented Spatiotemporal Graph Convolutional Network for Traffic Forecasting

Nov 22, 2020

Jiawei Zhu, Chao Tao, Hanhan Deng, Ling Zhao, Pu Wang, Tao Lin, Haifeng Li

Figure 1 for AST-GCN: Attribute-Augmented Spatiotemporal Graph Convolutional Network for Traffic Forecasting

Figure 2 for AST-GCN: Attribute-Augmented Spatiotemporal Graph Convolutional Network for Traffic Forecasting

Figure 3 for AST-GCN: Attribute-Augmented Spatiotemporal Graph Convolutional Network for Traffic Forecasting

Figure 4 for AST-GCN: Attribute-Augmented Spatiotemporal Graph Convolutional Network for Traffic Forecasting

Abstract:Traffic forecasting is a fundamental and challenging task in the field of intelligent transportation. Accurate forecasting not only depends on the historical traffic flow information but also needs to consider the influence of a variety of external factors, such as weather conditions and surrounding POI distribution. Recently, spatiotemporal models integrating graph convolutional networks and recurrent neural networks have become traffic forecasting research hotspots and have made significant progress. However, few works integrate external factors. Therefore, based on the assumption that introducing external factors can enhance the spatiotemporal accuracy in predicting traffic and improving interpretability, we propose an attribute-augmented spatiotemporal graph convolutional network (AST-GCN). We model the external factors as dynamic attributes and static attributes and design an attribute-augmented unit to encode and integrate those factors into the spatiotemporal graph convolution model. Experiments on real datasets show the effectiveness of considering external information on traffic forecasting tasks when compared to traditional traffic prediction methods. Moreover, under different attribute-augmented schemes and prediction horizon settings, the forecasting accuracy of the AST-GCN is higher than that of the baselines.

* 11 pages, 17 figures, 3 tables

Via

Access Paper or Ask Questions