Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zhihui Zhang

MEGA: Second-Order Gradient Alignment for Catastrophic Forgetting Mitigation in GFSCIL

Apr 18, 2025

Jinhui Pang, Changqing Lin, Hao Lin, Jinglin He, Zhengjun Li, Zhihui Zhang, Xiaoshuai Hao

Abstract:Graph Few-Shot Class-Incremental Learning (GFSCIL) enables models to continually learn from limited samples of novel tasks after initial training on a large base dataset. Existing GFSCIL approaches typically utilize Prototypical Networks (PNs) for metric-based class representations and fine-tune the model during the incremental learning stage. However, these PN-based methods oversimplify learning via novel query set fine-tuning and fail to integrate Graph Continual Learning (GCL) techniques due to architectural constraints. To address these challenges, we propose a more rigorous and practical setting for GFSCIL that excludes query sets during the incremental training phase. Building on this foundation, we introduce Model-Agnostic Meta Graph Continual Learning (MEGA), aimed at effectively alleviating catastrophic forgetting for GFSCIL. Specifically, by calculating the incremental second-order gradient during the meta-training stage, we endow the model to learn high-quality priors that enhance incremental learning by aligning its behaviors across both the meta-training and incremental learning stages. Extensive experiments on four mainstream graph datasets demonstrate that MEGA achieves state-of-the-art results and enhances the effectiveness of various GCL methods in GFSCIL. We believe that our proposed MEGA serves as a model-agnostic GFSCIL paradigm, paving the way for future research.

* Under Review

Via

Access Paper or Ask Questions

DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering

Nov 26, 2024

Zhihui Zhang, Xiaoshuai Hao, Hanning Yuan, Lianhua Chi, Qi Guo, Qi Li, Ziqiang Yuan, Jinhui Pang, Yexin Li, Sijie Ruan

Figure 1 for DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering

Figure 2 for DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering

Figure 3 for DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering

Figure 4 for DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering

Abstract:Multi-view contrastive clustering (MVCC) has gained significant attention for generating consistent clustering structures from multiple views through contrastive learning. However, most existing MVCC methods create cross-views by combining any two views, leading to a high volume of unreliable pairs. Furthermore, these approaches often overlook discrepancies in multi-view representations, resulting in representation degeneration. To address these challenges, we introduce a novel model called Dual-Weighted Contrastive Learning (DWCL) for Multi-View Clustering. Specifically, to reduce the impact of unreliable cross-views, we introduce an innovative Best-Other (B-O) contrastive mechanism that enhances the representation of individual views at a low computational cost. Furthermore, we develop a dual weighting strategy that combines a view quality weight, reflecting the quality of each view, with a view discrepancy weight. This approach effectively mitigates representation degeneration by downplaying cross-views that are both low in quality and high in discrepancy. We theoretically validate the efficiency of the B-O contrastive mechanism and the effectiveness of the dual weighting strategy. Extensive experiments demonstrate that DWCL outperforms previous methods across eight multi-view datasets, showcasing superior performance and robustness in MVCC. Specifically, our method achieves absolute accuracy improvements of 5.4\% and 5.6\% compared to state-of-the-art methods on the Caltech6V7 and MSRCv1 datasets, respectively.

Via

Access Paper or Ask Questions

ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing

Nov 07, 2024

Zhihui Zhang, Jinhui Pang, Jianan Li, Xiaoshuai Hao

Figure 1 for ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing

Figure 2 for ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing

Figure 3 for ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing

Figure 4 for ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing

Abstract:Multi-Image Super-Resolution (MISR) is a crucial yet challenging research task in the remote sensing community. In this paper, we address the challenging task of Multi-Image Super-Resolution in Remote Sensing (MISR-RS), aiming to generate a High-Resolution (HR) image from multiple Low-Resolution (LR) images obtained by satellites. Recently, the weak temporal correlations among LR images have attracted increasing attention in the MISR-RS task. However, existing MISR methods treat the LR images as sequences with strong temporal correlations, overlooking spatial correlations and imposing temporal dependencies. To address this problem, we propose a novel end-to-end framework named Enhancing Spatial Correlations in MISR (ESC-MISR), which fully exploits the spatial-temporal relations of multiple images for HR image reconstruction. Specifically, we first introduce a novel fusion module named Multi-Image Spatial Transformer (MIST), which emphasizes parts with clearer global spatial features and enhances the spatial correlations between LR images. Besides, we perform a random shuffle strategy for the sequential inputs of LR images to attenuate temporal dependencies and capture weak temporal correlations in the training stage. Compared with the state-of-the-art methods, our ESC-MISR achieves 0.70dB and 0.76dB cPSNR improvements on the two bands of the PROBA-V dataset respectively, demonstrating the superiority of our method.

Via

Access Paper or Ask Questions

FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning

Jul 28, 2024

Jinhui Pang, Changqing Lin, Xiaoshuai Hao, Rong Yin, Zixuan Wang, Zhihui Zhang, Jinglin He, Huang Tai Sheng

Figure 1 for FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning

Figure 2 for FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning

Figure 3 for FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning

Figure 4 for FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning

Abstract:Continual graph learning (CGL) is an important and challenging task that aims to extend static GNNs to dynamic task flow scenarios. As one of the mainstream CGL methods, the experience replay (ER) method receives widespread attention due to its superior performance. However, existing ER methods focus on identifying samples by feature significance or topological relevance, which limits their utilization of comprehensive graph data. In addition, the topology-based ER methods only consider local topological information and add neighboring nodes to the buffer, which ignores the global topological information and increases memory overhead. To bridge these gaps, we propose a novel method called Feature-Topology Fusion-based Experience Replay (FTF-ER) to effectively mitigate the catastrophic forgetting issue with enhanced efficiency. Specifically, from an overall perspective to maximize the utilization of the entire graph data, we propose a highly complementary approach including both feature and global topological information, which can significantly improve the effectiveness of the sampled nodes. Moreover, to further utilize global topological information, we propose Hodge Potential Score (HPS) as a novel module to calculate the topological importance of nodes. HPS derives a global node ranking via Hodge decomposition on graphs, providing more accurate global topological information compared to neighbor sampling. By excluding neighbor sampling, HPS significantly reduces buffer storage costs for acquiring topological information and simultaneously decreases training time. Compared with state-of-the-art methods, FTF-ER achieves a significant improvement of 3.6% in AA and 7.1% in AF on the OGB-Arxiv dataset, demonstrating its superior performance in the class-incremental learning setting.

* Accepted by ACM Multimedia 2024

Via

Access Paper or Ask Questions

Context-aware Session-based Recommendation with Graph Neural Networks

Oct 14, 2023

Zhihui Zhang, JianXiang Yu, Xiang Li

Abstract:Session-based recommendation (SBR) is a task that aims to predict items based on anonymous sequences of user behaviors in a session. While there are methods that leverage rich context information in sessions for SBR, most of them have the following limitations: 1) they fail to distinguish the item-item edge types when constructing the global graph for exploiting cross-session contexts; 2) they learn a fixed embedding vector for each item, which lacks the flexibility to reflect the variation of user interests across sessions; 3) they generally use the one-hot encoded vector of the target item as the hard label to predict, thus failing to capture the true user preference. To solve these issues, we propose CARES, a novel context-aware session-based recommendation model with graph neural networks, which utilizes different types of contexts in sessions to capture user interests. Specifically, we first construct a multi-relation cross-session graph to connect items according to intra- and cross-session item-level contexts. Further, to encode the variation of user interests, we design personalized item representations. Finally, we employ a label collaboration strategy for generating soft user preference distribution as labels. Experiments on three benchmark datasets demonstrate that CARES consistently outperforms state-of-the-art models in terms of P@20 and MRR@20. Our data and codes are publicly available at https://github.com/brilliantZhang/CARES.

* 10 pages, 10 figures, conference

Via

Access Paper or Ask Questions

Accelerating Generic Graph Neural Networks via Architecture, Compiler, Partition Method Co-Design

Aug 16, 2023

Shuwen Lu, Zhihui Zhang, Cong Guo, Jingwen Leng, Yangjie Zhou, Minyi Guo

Abstract:Graph neural networks (GNNs) have shown significant accuracy improvements in a variety of graph learning domains, sparking considerable research interest. To translate these accuracy improvements into practical applications, it is essential to develop high-performance and efficient hardware acceleration for GNN models. However, designing GNN accelerators faces two fundamental challenges: the high bandwidth requirement of GNN models and the diversity of GNN models. Previous works have addressed the first challenge by using more expensive memory interfaces to achieve higher bandwidth. For the second challenge, existing works either support specific GNN models or have generic designs with poor hardware utilization. In this work, we tackle both challenges simultaneously. First, we identify a new type of partition-level operator fusion, which we utilize to internally reduce the high bandwidth requirement of GNNs. Next, we introduce partition-level multi-threading to schedule the concurrent processing of graph partitions, utilizing different hardware resources. To further reduce the extra on-chip memory required by multi-threading, we propose fine-grained graph partitioning to generate denser graph partitions. Importantly, these three methods make no assumptions about the targeted GNN models, addressing the challenge of model variety. We implement these methods in a framework called SwitchBlade, consisting of a compiler, a graph partitioner, and a hardware accelerator. Our evaluation demonstrates that SwitchBlade achieves an average speedup of $1.85\times$ and energy savings of $19.03\times$ compared to the NVIDIA V100 GPU. Additionally, SwitchBlade delivers performance comparable to state-of-the-art specialized accelerators.

Via

Access Paper or Ask Questions

Architectural Implications of Graph Neural Networks

Sep 02, 2020

Zhihui Zhang, Jingwen Leng, Lingxiao Ma, Youshan Miao, Chao Li, Minyi Guo

Figure 1 for Architectural Implications of Graph Neural Networks

Figure 2 for Architectural Implications of Graph Neural Networks

Figure 3 for Architectural Implications of Graph Neural Networks

Figure 4 for Architectural Implications of Graph Neural Networks

Abstract:Graph neural networks (GNN) represent an emerging line of deep learning models that operate on graph structures. It is becoming more and more popular due to its high accuracy achieved in many graph-related tasks. However, GNN is not as well understood in the system and architecture community as its counterparts such as multi-layer perceptrons and convolutional neural networks. This work tries to introduce the GNN to our community. In contrast to prior work that only presents characterizations of GCNs, our work covers a large portion of the varieties for GNN workloads based on a general GNN description framework. By constructing the models on top of two widely-used libraries, we characterize the GNN computation at inference stage concerning general-purpose and application-specific architectures and hope our work can foster more system and architecture research for GNNs.

* in IEEE Computer Architecture Letters, vol. 19, no. 1, pp. 59-62, 1 Jan.-June 2020
* 4 pages, published in IEEE Computer Architecture Letters (CAL) 2020

Via

Access Paper or Ask Questions