Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pengfei Jiao

Data Poisoning in Deep Learning: A Survey

Mar 27, 2025

Pinlong Zhao, Weiyao Zhu, Pengfei Jiao, Di Gao, Ou Wu

Abstract:Deep learning has become a cornerstone of modern artificial intelligence, enabling transformative applications across a wide range of domains. As the core element of deep learning, the quality and security of training data critically influence model performance and reliability. However, during the training process, deep learning models face the significant threat of data poisoning, where attackers introduce maliciously manipulated training data to degrade model accuracy or lead to anomalous behavior. While existing surveys provide valuable insights into data poisoning, they generally adopt a broad perspective, encompassing both attacks and defenses, but lack a dedicated, in-depth analysis of poisoning attacks specifically in deep learning. In this survey, we bridge this gap by presenting a comprehensive and targeted review of data poisoning in deep learning. First, this survey categorizes data poisoning attacks across multiple perspectives, providing an in-depth analysis of their characteristics and underlying design princinples. Second, the discussion is extended to the emerging area of data poisoning in large language models(LLMs). Finally, we explore critical open challenges in the field and propose potential research directions to advance the field further. To support further exploration, an up-to-date repository of resources on data poisoning in deep learning is available at https://github.com/Pinlong-Zhao/Data-Poisoning.

Via

Access Paper or Ask Questions

MSE-Nets: Multi-annotated Semi-supervised Ensemble Networks for Improving Segmentation of Medical Image with Ambiguous Boundaries

Nov 17, 2023

Shuai Wang, Tengjin Weng, Jingyi Wang, Yang Shen, Zhidong Zhao, Yixiu Liu, Pengfei Jiao, Zhiming Cheng, Yaqi Wang

Figure 1 for MSE-Nets: Multi-annotated Semi-supervised Ensemble Networks for Improving Segmentation of Medical Image with Ambiguous Boundaries

Figure 2 for MSE-Nets: Multi-annotated Semi-supervised Ensemble Networks for Improving Segmentation of Medical Image with Ambiguous Boundaries

Figure 3 for MSE-Nets: Multi-annotated Semi-supervised Ensemble Networks for Improving Segmentation of Medical Image with Ambiguous Boundaries

Figure 4 for MSE-Nets: Multi-annotated Semi-supervised Ensemble Networks for Improving Segmentation of Medical Image with Ambiguous Boundaries

Abstract:Medical image segmentation annotations exhibit variations among experts due to the ambiguous boundaries of segmented objects and backgrounds in medical images. Although using multiple annotations for each image in the fully-supervised has been extensively studied for training deep models, obtaining a large amount of multi-annotated data is challenging due to the substantial time and manpower costs required for segmentation annotations, resulting in most images lacking any annotations. To address this, we propose Multi-annotated Semi-supervised Ensemble Networks (MSE-Nets) for learning segmentation from limited multi-annotated and abundant unannotated data. Specifically, we introduce the Network Pairwise Consistency Enhancement (NPCE) module and Multi-Network Pseudo Supervised (MNPS) module to enhance MSE-Nets for the segmentation task by considering two major factors: (1) to optimize the utilization of all accessible multi-annotated data, the NPCE separates (dis)agreement annotations of multi-annotated data at the pixel level and handles agreement and disagreement annotations in different ways, (2) to mitigate the introduction of imprecise pseudo-labels, the MNPS extends the training data by leveraging consistent pseudo-labels from unannotated data. Finally, we improve confidence calibration by averaging the predictions of base networks. Experiments on the ISIC dataset show that we reduced the demand for multi-annotated data by 97.75\% and narrowed the gap with the best fully-supervised baseline to just a Jaccard index of 4\%. Furthermore, compared to other semi-supervised methods that rely only on a single annotation or a combined fusion approach, the comprehensive experimental results on ISIC and RIGA datasets demonstrate the superior performance of our proposed method in medical image segmentation with ambiguous boundaries.

Via

Access Paper or Ask Questions

Temporal Graph Representation Learning with Adaptive Augmentation Contrastive

Nov 07, 2023

Hongjiang Chen, Pengfei Jiao, Huijun Tang, Huaming Wu

Abstract:Temporal graph representation learning aims to generate low-dimensional dynamic node embeddings to capture temporal information as well as structural and property information. Current representation learning methods for temporal networks often focus on capturing fine-grained information, which may lead to the model capturing random noise instead of essential semantic information. While graph contrastive learning has shown promise in dealing with noise, it only applies to static graphs or snapshots and may not be suitable for handling time-dependent noise. To alleviate the above challenge, we propose a novel Temporal Graph representation learning with Adaptive augmentation Contrastive (TGAC) model. The adaptive augmentation on the temporal graph is made by combining prior knowledge with temporal information, and the contrastive objective function is constructed by defining the augmented inter-view contrast and intra-view contrast. To complement TGAC, we propose three adaptive augmentation strategies that modify topological features to reduce noise from the network. Our extensive experiments on various real networks demonstrate that the proposed model outperforms other temporal graph representation learning methods.

* Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 683-699. 2023

Via

Access Paper or Ask Questions

Representation Learning on Heterostructures via Heterogeneous Anonymous Walks

Jan 18, 2022

Xuan Guo, Pengfei Jiao, Ting Pan, Wang Zhang, Mengyu Jia, Danyang Shi, Wenjun Wang

Figure 1 for Representation Learning on Heterostructures via Heterogeneous Anonymous Walks

Figure 2 for Representation Learning on Heterostructures via Heterogeneous Anonymous Walks

Figure 3 for Representation Learning on Heterostructures via Heterogeneous Anonymous Walks

Figure 4 for Representation Learning on Heterostructures via Heterogeneous Anonymous Walks

Abstract:Capturing structural similarity has been a hot topic in the field of network embedding recently due to its great help in understanding the node functions and behaviors. However, existing works have paid very much attention to learning structures on homogeneous networks while the related study on heterogeneous networks is still a void. In this paper, we try to take the first step for representation learning on heterostructures, which is very challenging due to their highly diverse combinations of node types and underlying structures. To effectively distinguish diverse heterostructures, we firstly propose a theoretically guaranteed technique called heterogeneous anonymous walk (HAW) and its variant coarse HAW (CHAW). Then, we devise the heterogeneous anonymous walk embedding (HAWE) and its variant coarse HAWE in a data-driven manner to circumvent using an extremely large number of possible walks and train embeddings by predicting occurring walks in the neighborhood of each node. Finally, we design and apply extensive and illustrative experiments on synthetic and real-world networks to build a benchmark on heterostructure learning and evaluate the effectiveness of our methods. The results demonstrate our methods achieve outstanding performance compared with both homogeneous and heterogeneous classic methods, and can be applied on large-scale networks.

* 13 pages, 6 figures, 5 tables

Via

Access Paper or Ask Questions

Block Modeling-Guided Graph Convolutional Neural Networks

Dec 28, 2021

Dongxiao He, Chundong Liang, Huixin Liu, Mingxiang Wen, Pengfei Jiao, Zhiyong Feng

Figure 1 for Block Modeling-Guided Graph Convolutional Neural Networks

Figure 2 for Block Modeling-Guided Graph Convolutional Neural Networks

Figure 3 for Block Modeling-Guided Graph Convolutional Neural Networks

Figure 4 for Block Modeling-Guided Graph Convolutional Neural Networks

Abstract:Graph Convolutional Network (GCN) has shown remarkable potential of exploring graph representation. However, the GCN aggregating mechanism fails to generalize to networks with heterophily where most nodes have neighbors from different classes, which commonly exists in real-world networks. In order to make the propagation and aggregation mechanism of GCN suitable for both homophily and heterophily (or even their mixture), we introduce block modeling into the framework of GCN so that it can realize "block-guided classified aggregation", and automatically learn the corresponding aggregation rules for neighbors of different classes. By incorporating block modeling into the aggregation process, GCN is able to aggregate information from homophilic and heterophilic neighbors discriminately according to their homophily degree. We compared our algorithm with state-of-art methods which deal with the heterophily problem. Empirical results demonstrate the superiority of our new approach over existing methods in heterophilic datasets while maintaining a competitive performance in homophilic datasets.

* Accepted by Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

Via

Access Paper or Ask Questions

A Survey on Role-Oriented Network Embedding

Jul 18, 2021

Pengfei Jiao, Xuan Guo, Ting Pan, Wang Zhang, Yulong Pei

Figure 1 for A Survey on Role-Oriented Network Embedding

Figure 2 for A Survey on Role-Oriented Network Embedding

Figure 3 for A Survey on Role-Oriented Network Embedding

Figure 4 for A Survey on Role-Oriented Network Embedding

Abstract:Recently, Network Embedding (NE) has become one of the most attractive research topics in machine learning and data mining. NE approaches have achieved promising performance in various of graph mining tasks including link prediction and node clustering and classification. A wide variety of NE methods focus on the proximity of networks. They learn community-oriented embedding for each node, where the corresponding representations are similar if two nodes are closer to each other in the network. Meanwhile, there is another type of structural similarity, i.e., role-based similarity, which is usually complementary and completely different from the proximity. In order to preserve the role-based structural similarity, the problem of role-oriented NE is raised. However, compared to community-oriented NE problem, there are only a few role-oriented embedding approaches proposed recently. Although less explored, considering the importance of roles in analyzing networks and many applications that role-oriented NE can shed light on, it is necessary and timely to provide a comprehensive overview of existing role-oriented NE methods. In this review, we first clarify the differences between community-oriented and role-oriented network embedding. Afterwards, we propose a general framework for understanding role-oriented NE and a two-level categorization to better classify existing methods. Then, we select some representative methods according to the proposed categorization and briefly introduce them by discussing their motivation, development and differences. Moreover, we conduct comprehensive experiments to empirically evaluate these methods on a variety of role-related tasks including node classification and clustering (role discovery), top-k similarity search and visualization using some widely used synthetic and real-world datasets...

* 20 pages,9 figures, 5 tables

Via

Access Paper or Ask Questions

A Survey of Community Detection Approaches: From Statistical Modeling to Deep Learning

Jan 03, 2021

Di Jin, Zhizhi Yu, Pengfei Jiao, Shirui Pan, Philip S. Yu, Weixiong Zhang

Figure 1 for A Survey of Community Detection Approaches: From Statistical Modeling to Deep Learning

Figure 2 for A Survey of Community Detection Approaches: From Statistical Modeling to Deep Learning

Figure 3 for A Survey of Community Detection Approaches: From Statistical Modeling to Deep Learning

Figure 4 for A Survey of Community Detection Approaches: From Statistical Modeling to Deep Learning

Abstract:Community detection, a fundamental task for network analysis, aims to partition a network into multiple sub-structures to help reveal their latent functions. Community detection has been extensively studied in and broadly applied to many real-world network problems. Classical approaches to community detection typically utilize probabilistic graphical models and adopt a variety of prior knowledge to infer community structures. As the problems that network methods try to solve and the network data to be analyzed become increasingly more sophisticated, new approaches have also been proposed and developed, particularly those that utilize deep learning and convert networked data into low dimensional representation. Despite all the recent advancement, there is still a lack of insightful understanding of the theoretical and methodological underpinning of community detection, which will be critically important for future development of the area of network analysis. In this paper, we develop and present a unified architecture of network community-finding methods to characterize the state-of-the-art of the field of community detection. Specifically, we provide a comprehensive review of the existing community detection methods and introduce a new taxonomy that divides the existing methods into two categories, namely probabilistic graphical model and deep learning. We then discuss in detail the main idea behind each method in the two categories. Furthermore, to promote future development of community detection, we release several benchmark datasets from several problem domains and highlight their applications to various network analysis tasks. We conclude with discussions of the challenges of the field and suggestions of possible directions for future research.

Via

Access Paper or Ask Questions

NRPA: Neural Recommendation with Personalized Attention

May 29, 2019

Hongtao Liu, Fangzhao Wu, Wenjun Wang, Xianchen Wang, Pengfei Jiao, Chuhan Wu, Xing Xie

Figure 1 for NRPA: Neural Recommendation with Personalized Attention

Figure 2 for NRPA: Neural Recommendation with Personalized Attention

Figure 3 for NRPA: Neural Recommendation with Personalized Attention

Figure 4 for NRPA: Neural Recommendation with Personalized Attention

Abstract:Existing review-based recommendation methods usually use the same model to learn the representations of all users/items from reviews posted by users towards items. However, different users have different preference and different items have different characteristics. Thus, the same word or similar reviews may have different informativeness for different users and items. In this paper we propose a neural recommendation approach with personalized attention to learn personalized representations of users and items from reviews. We use a review encoder to learn representations of reviews from words, and a user/item encoder to learn representations of users or items from reviews. We propose a personalized attention model, and apply it to both review and user/item encoders to select different important words and reviews for different users/items. Experiments on five datasets validate our approach can effectively improve the performance of neural recommendation.

* sigir 2019
* 4 pages, 4 figures

Via

Access Paper or Ask Questions