Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hui He

Circumventing Backdoor Space via Weight Symmetry

Jun 09, 2025

Jie Peng, Hongwei Yang, Jing Zhao, Hengji Dong, Hui He, Weizhe Zhang, Haoyu He

Abstract:Deep neural networks are vulnerable to backdoor attacks, where malicious behaviors are implanted during training. While existing defenses can effectively purify compromised models, they typically require labeled data or specific training procedures, making them difficult to apply beyond supervised learning settings. Notably, recent studies have shown successful backdoor attacks across various learning paradigms, highlighting a critical security concern. To address this gap, we propose Two-stage Symmetry Connectivity (TSC), a novel backdoor purification defense that operates independently of data format and requires only a small fraction of clean samples. Through theoretical analysis, we prove that by leveraging permutation invariance in neural networks and quadratic mode connectivity, TSC amplifies the loss on poisoned samples while maintaining bounded clean accuracy. Experiments demonstrate that TSC achieves robust performance comparable to state-of-the-art methods in supervised learning scenarios. Furthermore, TSC generalizes to self-supervised learning frameworks, such as SimCLR and CLIP, maintaining its strong defense capabilities. Our code is available at https://github.com/JiePeng104/TSC.

Via

Access Paper or Ask Questions

FGSGT: Saliency-Guided Siamese Network Tracker Based on Key Fine-Grained Feature Information for Thermal Infrared Target Tracking

Apr 19, 2025

Ruoyan Xiong, Huanbin Zhang, Shentao Wang, Hui He, Yuke Hou, Yue Zhang, Yujie Cui, Huipan Guan, Shang Zhang

Abstract:Thermal infrared (TIR) images typically lack detailed features and have low contrast, making it challenging for conventional feature extraction models to capture discriminative target characteristics. As a result, trackers are often affected by interference from visually similar objects and are susceptible to tracking drift. To address these challenges, we propose a novel saliency-guided Siamese network tracker based on key fine-grained feature infor-mation. First, we introduce a fine-grained feature parallel learning convolu-tional block with a dual-stream architecture and convolutional kernels of varying sizes. This design captures essential global features from shallow layers, enhances feature diversity, and minimizes the loss of fine-grained in-formation typically encountered in residual connections. In addition, we propose a multi-layer fine-grained feature fusion module that uses bilinear matrix multiplication to effectively integrate features across both deep and shallow layers. Next, we introduce a Siamese residual refinement block that corrects saliency map prediction errors using residual learning. Combined with deep supervision, this mechanism progressively refines predictions, ap-plying supervision at each recursive step to ensure consistent improvements in accuracy. Finally, we present a saliency loss function to constrain the sali-ency predictions, directing the network to focus on highly discriminative fi-ne-grained features. Extensive experiment results demonstrate that the pro-posed tracker achieves the highest precision and success rates on the PTB-TIR and LSOTB-TIR benchmarks. It also achieves a top accuracy of 0.78 on the VOT-TIR 2015 benchmark and 0.75 on the VOT-TIR 2017 benchmark.

Via

Access Paper or Ask Questions

DCFG: Diverse Cross-Channel Fine-Grained Feature Learning and Progressive Fusion Siamese Tracker for Thermal Infrared Target Tracking

Apr 19, 2025

Ruoyan Xiong, Yuke Hou, Princess Retor Torboh, Hui He, Huanbin Zhang, Yue Zhang, Yanpin Wang, Huipan Guan, Shang Zhang

Figure 1 for DCFG: Diverse Cross-Channel Fine-Grained Feature Learning and Progressive Fusion Siamese Tracker for Thermal Infrared Target Tracking

Figure 2 for DCFG: Diverse Cross-Channel Fine-Grained Feature Learning and Progressive Fusion Siamese Tracker for Thermal Infrared Target Tracking

Figure 3 for DCFG: Diverse Cross-Channel Fine-Grained Feature Learning and Progressive Fusion Siamese Tracker for Thermal Infrared Target Tracking

Figure 4 for DCFG: Diverse Cross-Channel Fine-Grained Feature Learning and Progressive Fusion Siamese Tracker for Thermal Infrared Target Tracking

Abstract:To address the challenge of capturing highly discriminative features in ther-mal infrared (TIR) tracking, we propose a novel Siamese tracker based on cross-channel fine-grained feature learning and progressive fusion. First, we introduce a cross-channel fine-grained feature learning network that employs masks and suppression coefficients to suppress dominant target features, en-abling the tracker to capture more detailed and subtle information. The net-work employs a channel rearrangement mechanism to enhance efficient in-formation flow, coupled with channel equalization to reduce parameter count. Additionally, we incorporate layer-by-layer combination units for ef-fective feature extraction and fusion, thereby minimizing parameter redun-dancy and computational complexity. The network further employs feature redirection and channel shuffling strategies to better integrate fine-grained details. Second, we propose a specialized cross-channel fine-grained loss function designed to guide feature groups toward distinct discriminative re-gions of the target, thus improving overall target representation. This loss function includes an inter-channel loss term that promotes orthogonality be-tween channels, maximizing feature diversity and facilitating finer detail capture. Extensive experiments demonstrate that our proposed tracker achieves the highest accuracy, scoring 0.81 on the VOT-TIR 2015 and 0.78 on the VOT-TIR 2017 benchmark, while also outperforming other methods across all evaluation metrics on the LSOTB-TIR and PTB-TIR benchmarks.

Via

Access Paper or Ask Questions

RURANET++: An Unsupervised Learning Method for Diabetic Macular Edema Based on SCSE Attention Mechanisms and Dynamic Multi-Projection Head Clustering

Feb 27, 2025

Wei Yang, Yiran Zhu, Jiayu Shen, Yuhan Tang, Chengchang Pan, Hui He, Yan Su, Honggang Qi

Figure 1 for RURANET++: An Unsupervised Learning Method for Diabetic Macular Edema Based on SCSE Attention Mechanisms and Dynamic Multi-Projection Head Clustering

Figure 2 for RURANET++: An Unsupervised Learning Method for Diabetic Macular Edema Based on SCSE Attention Mechanisms and Dynamic Multi-Projection Head Clustering

Figure 3 for RURANET++: An Unsupervised Learning Method for Diabetic Macular Edema Based on SCSE Attention Mechanisms and Dynamic Multi-Projection Head Clustering

Figure 4 for RURANET++: An Unsupervised Learning Method for Diabetic Macular Edema Based on SCSE Attention Mechanisms and Dynamic Multi-Projection Head Clustering

Abstract:Diabetic Macular Edema (DME), a prevalent complication among diabetic patients, constitutes a major cause of visual impairment and blindness. Although deep learning has achieved remarkable progress in medical image analysis, traditional DME diagnosis still relies on extensive annotated data and subjective ophthalmologist assessments, limiting practical applications. To address this, we present RURANET++, an unsupervised learning-based automated DME diagnostic system. This framework incorporates an optimized U-Net architecture with embedded Spatial and Channel Squeeze & Excitation (SCSE) attention mechanisms to enhance lesion feature extraction. During feature processing, a pre-trained GoogLeNet model extracts deep features from retinal images, followed by PCA-based dimensionality reduction to 50 dimensions for computational efficiency. Notably, we introduce a novel clustering algorithm employing multi-projection heads to explicitly control cluster diversity while dynamically adjusting similarity thresholds, thereby optimizing intra-class consistency and inter-class discrimination. Experimental results demonstrate superior performance across multiple metrics, achieving maximum accuracy (0.8411), precision (0.8593), recall (0.8411), and F1-score (0.8390), with exceptional clustering quality. This work provides an efficient unsupervised solution for DME diagnosis with significant clinical implications.

* 10 pages, 2 figures, 5 tables, submitted to The 28th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2025)

Via

Access Paper or Ask Questions

FilterNet: Harnessing Frequency Filters for Time Series Forecasting

Nov 05, 2024

Kun Yi, Jingru Fei, Qi Zhang, Hui He, Shufeng Hao, Defu Lian, Wei Fan

Figure 1 for FilterNet: Harnessing Frequency Filters for Time Series Forecasting

Figure 2 for FilterNet: Harnessing Frequency Filters for Time Series Forecasting

Figure 3 for FilterNet: Harnessing Frequency Filters for Time Series Forecasting

Figure 4 for FilterNet: Harnessing Frequency Filters for Time Series Forecasting

Abstract:While numerous forecasters have been proposed using different network architectures, the Transformer-based models have state-of-the-art performance in time series forecasting. However, forecasters based on Transformers are still suffering from vulnerability to high-frequency signals, efficiency in computation, and bottleneck in full-spectrum utilization, which essentially are the cornerstones for accurately predicting time series with thousands of points. In this paper, we explore a novel perspective of enlightening signal processing for deep time series forecasting. Inspired by the filtering process, we introduce one simple yet effective network, namely FilterNet, built upon our proposed learnable frequency filters to extract key informative temporal patterns by selectively passing or attenuating certain components of time series signals. Concretely, we propose two kinds of learnable filters in the FilterNet: (i) Plain shaping filter, that adopts a universal frequency kernel for signal filtering and temporal modeling; (ii) Contextual shaping filter, that utilizes filtered frequencies examined in terms of its compatibility with input signals for dependency learning. Equipped with the two filters, FilterNet can approximately surrogate the linear and attention mappings widely adopted in time series literature, while enjoying superb abilities in handling high-frequency noises and utilizing the whole frequency spectrum that is beneficial for forecasting. Finally, we conduct extensive experiments on eight time series forecasting benchmarks, and experimental results have demonstrated our superior performance in terms of both effectiveness and efficiency compared with state-of-the-art methods. Code is available at this repository: https://github.com/aikunyi/FilterNet

* Accepted by NeurIPS 2024

Via

Access Paper or Ask Questions

How to Bridge Spatial and Temporal Heterogeneity in Link Prediction? A Contrastive Method

Nov 01, 2024

Yu Tai, Xinglong Wu, Hongwei Yang, Hui He, Duanjing Chen, Yuanming Shao, Weizhe Zhang

Figure 1 for How to Bridge Spatial and Temporal Heterogeneity in Link Prediction? A Contrastive Method

Figure 2 for How to Bridge Spatial and Temporal Heterogeneity in Link Prediction? A Contrastive Method

Figure 3 for How to Bridge Spatial and Temporal Heterogeneity in Link Prediction? A Contrastive Method

Figure 4 for How to Bridge Spatial and Temporal Heterogeneity in Link Prediction? A Contrastive Method

Abstract:Temporal Heterogeneous Networks play a crucial role in capturing the dynamics and heterogeneity inherent in various real-world complex systems, rendering them a noteworthy research avenue for link prediction. However, existing methods fail to capture the fine-grained differential distribution patterns and temporal dynamic characteristics, which we refer to as spatial heterogeneity and temporal heterogeneity. To overcome such limitations, we propose a novel \textbf{C}ontrastive Learning-based \textbf{L}ink \textbf{P}rediction model, \textbf{CLP}, which employs a multi-view hierarchical self-supervised architecture to encode spatial and temporal heterogeneity. Specifically, aiming at spatial heterogeneity, we develop a spatial feature modeling layer to capture the fine-grained topological distribution patterns from node- and edge-level representations, respectively. Furthermore, aiming at temporal heterogeneity, we devise a temporal information modeling layer to perceive the evolutionary dependencies of dynamic graph topologies from time-level representations. Finally, we encode the spatial and temporal distribution heterogeneity from a contrastive learning perspective, enabling a comprehensive self-supervised hierarchical relation modeling for the link prediction task. Extensive experiments conducted on four real-world dynamic heterogeneous network datasets verify that our \mymodel consistently outperforms the state-of-the-art models, demonstrating an average improvement of 10.10\%, 13.44\% in terms of AUC and AP, respectively.

Via

Access Paper or Ask Questions

Multi-SIGATnet: A multimodal schizophrenia MRI classification algorithm using sparse interaction mechanisms and graph attention networks

Aug 25, 2024

Yuhong Jiao, Jiaqing Miao, Jinnan Gong, Hui He, Ping Liang, Cheng Luo, Ying Tan

Figure 1 for Multi-SIGATnet: A multimodal schizophrenia MRI classification algorithm using sparse interaction mechanisms and graph attention networks

Figure 2 for Multi-SIGATnet: A multimodal schizophrenia MRI classification algorithm using sparse interaction mechanisms and graph attention networks

Figure 3 for Multi-SIGATnet: A multimodal schizophrenia MRI classification algorithm using sparse interaction mechanisms and graph attention networks

Figure 4 for Multi-SIGATnet: A multimodal schizophrenia MRI classification algorithm using sparse interaction mechanisms and graph attention networks

Abstract:Schizophrenia is a serious psychiatric disorder. Its pathogenesis is not completely clear, making it difficult to treat patients precisely. Because of the complicated non-Euclidean network structure of the human brain, learning critical information from brain networks remains difficult. To effectively capture the topological information of brain neural networks, a novel multimodal graph attention network based on sparse interaction mechanism (Multi-SIGATnet) was proposed for SZ classification was proposed for SZ classification. Firstly, structural and functional information were fused into multimodal data to obtain more comprehensive and abundant features for patients with SZ. Subsequently, a sparse interaction mechanism was proposed to effectively extract salient features and enhance the feature representation capability. By enhancing the strong connections and weakening the weak connections between feature information based on an asymmetric convolutional network, high-order interactive features were captured. Moreover, sparse learning strategies were designed to filter out redundant connections to improve model performance. Finally, local and global features were updated in accordance with the topological features and connection weight constraints of the higher-order brain network, the features being projected to the classification target space for disorder classification. The effectiveness of the model is verified on the Center for Biomedical Research Excellence (COBRE) and University of California Los Angeles (UCLA) datasets, achieving 81.9\% and 75.8\% average accuracy, respectively, 4.6\% and 5.5\% higher than the graph attention network (GAT) method. Experiments showed that the Multi-SIGATnet method exhibited good performance in identifying SZ.

Via

Access Paper or Ask Questions

Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift

Jul 18, 2024

Hui He, Qi Zhang, Kun Yi, Xiaojun Xue, Shoujin Wang, Liang Hu, Longbing Cao

Figure 1 for Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift

Figure 2 for Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift

Figure 3 for Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift

Figure 4 for Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift

Abstract:The non-stationary nature of real-world Multivariate Time Series (MTS) data presents forecasting models with a formidable challenge of the time-variant distribution of time series, referred to as distribution shift. Existing studies on the distribution shift mostly adhere to adaptive normalization techniques for alleviating temporal mean and covariance shifts or time-variant modeling for capturing temporal shifts. Despite improving model generalization, these normalization-based methods often assume a time-invariant transition between outputs and inputs but disregard specific intra-/inter-series correlations, while time-variant models overlook the intrinsic causes of the distribution shift. This limits model expressiveness and interpretability of tackling the distribution shift for MTS forecasting. To mitigate such a dilemma, we present a unified Probabilistic Graphical Model to Jointly capturing intra-/inter-series correlations and modeling the time-variant transitional distribution, and instantiate a neural framework called JointPGM for non-stationary MTS forecasting. Specifically, JointPGM first employs multiple Fourier basis functions to learn dynamic time factors and designs two distinct learners: intra-series and inter-series learners. The intra-series learner effectively captures temporal dynamics by utilizing temporal gates, while the inter-series learner explicitly models spatial dynamics through multi-hop propagation, incorporating Gumbel-softmax sampling. These two types of series dynamics are subsequently fused into a latent variable, which is inversely employed to infer time factors, generate final prediction, and perform reconstruction. We validate the effectiveness and efficiency of JointPGM through extensive experiments on six highly non-stationary MTS datasets, achieving state-of-the-art forecasting performance of MTS forecasting.

* 19 pages, 11 figures

Via

Access Paper or Ask Questions

Towards Bridging the Cross-modal Semantic Gap for Multi-modal Recommendation

Jul 07, 2024

Xinglong Wu, Anfeng Huang, Hongwei Yang, Hui He, Yu Tai, Weizhe Zhang

Abstract:Multi-modal recommendation greatly enhances the performance of recommender systems by modeling the auxiliary information from multi-modality contents. Most existing multi-modal recommendation models primarily exploit multimedia information propagation processes to enrich item representations and directly utilize modal-specific embedding vectors independently obtained from upstream pre-trained models. However, this might be inappropriate since the abundant task-specific semantics remain unexplored, and the cross-modality semantic gap hinders the recommendation performance. Inspired by the recent progress of the cross-modal alignment model CLIP, in this paper, we propose a novel \textbf{CLIP} \textbf{E}nhanced \textbf{R}ecommender (\textbf{CLIPER}) framework to bridge the semantic gap between modalities and extract fine-grained multi-view semantic information. Specifically, we introduce a multi-view modality-alignment approach for representation extraction and measure the semantic similarity between modalities. Furthermore, we integrate the multi-view multimedia representations into downstream recommendation models. Extensive experiments conducted on three public datasets demonstrate the consistent superiority of our model over state-of-the-art multi-modal recommendation models.

Via

Access Paper or Ask Questions

Deep Coupling Network For Multivariate Time Series Forecasting

Feb 23, 2024

Kun Yi, Qi Zhang, Hui He, Kaize Shi, Liang Hu, Ning An, Zhendong Niu

Figure 1 for Deep Coupling Network For Multivariate Time Series Forecasting

Figure 2 for Deep Coupling Network For Multivariate Time Series Forecasting

Figure 3 for Deep Coupling Network For Multivariate Time Series Forecasting

Figure 4 for Deep Coupling Network For Multivariate Time Series Forecasting

Abstract:Multivariate time series (MTS) forecasting is crucial in many real-world applications. To achieve accurate MTS forecasting, it is essential to simultaneously consider both intra- and inter-series relationships among time series data. However, previous work has typically modeled intra- and inter-series relationships separately and has disregarded multi-order interactions present within and between time series data, which can seriously degrade forecasting accuracy. In this paper, we reexamine intra- and inter-series relationships from the perspective of mutual information and accordingly construct a comprehensive relationship learning mechanism tailored to simultaneously capture the intricate multi-order intra- and inter-series couplings. Based on the mechanism, we propose a novel deep coupling network for MTS forecasting, named DeepCN, which consists of a coupling mechanism dedicated to explicitly exploring the multi-order intra- and inter-series relationships among time series data concurrently, a coupled variable representation module aimed at encoding diverse variable patterns, and an inference module facilitating predictions through one forward step. Extensive experiments conducted on seven real-world datasets demonstrate that our proposed DeepCN achieves superior performance compared with the state-of-the-art baselines.

Via

Access Paper or Ask Questions