Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yangkun Wang

ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real-World User-AI Conversation

Oct 26, 2023

Zi Lin, Zihan Wang, Yongqi Tong, Yangkun Wang, Yuxin Guo, Yujia Wang, Jingbo Shang

Abstract:Despite remarkable advances that large language models have achieved in chatbots, maintaining a non-toxic user-AI interactive environment has become increasingly critical nowadays. However, previous efforts in toxicity detection have been mostly based on benchmarks derived from social media content, leaving the unique challenges inherent to real-world user-AI interactions insufficiently explored. In this work, we introduce ToxicChat, a novel benchmark based on real user queries from an open-source chatbot. This benchmark contains the rich, nuanced phenomena that can be tricky for current toxicity detection models to identify, revealing a significant domain difference compared to social media content. Our systematic evaluation of models trained on existing toxicity datasets has shown their shortcomings when applied to this unique domain of ToxicChat. Our work illuminates the potentially overlooked challenges of toxicity detection in real-world user-AI conversations. In the future, ToxicChat can be a valuable resource to drive further advancements toward building a safe and healthy environment for user-AI interactions.

* EMNLP findings 2023

Via

Access Paper or Ask Questions

Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation

May 24, 2023

Prashant Krishnan, Zilong Wang, Yangkun Wang, Jingbo Shang

Abstract:Recent advances of incorporating layout information, typically bounding box coordinates, into pre-trained language models have achieved significant performance in entity recognition from document images. Using coordinates can easily model the absolute position of each token, but they might be sensitive to manipulations in document images (e.g., shifting, rotation or scaling), especially when the training data is limited in few-shot settings. In this paper, we propose to further introduce the topological adjacency relationship among the tokens, emphasizing their relative position information. Specifically, we consider the tokens in the documents as nodes and formulate the edges based on the topological heuristics from the k-nearest bounding boxes. Such adjacency graphs are invariant to affine transformations including shifting, rotations and scaling. We incorporate these graphs into the pre-trained language model by adding graph neural network layers on top of the language model embeddings, leading to a novel model LAGER. Extensive experiments on two benchmark datasets show that LAGER significantly outperforms strong baselines under different few-shot settings and also demonstrate better robustness to manipulations.

Via

Access Paper or Ask Questions

Refined Edge Usage of Graph Neural Networks for Edge Prediction

Dec 25, 2022

Jiarui Jin, Yangkun Wang, Weinan Zhang, Quan Gan, Xiang Song, Yong Yu, Zheng Zhang, David Wipf

Abstract:Graph Neural Networks (GNNs), originally proposed for node classification, have also motivated many recent works on edge prediction (a.k.a., link prediction). However, existing methods lack elaborate design regarding the distinctions between two tasks that have been frequently overlooked: (i) edges only constitute the topology in the node classification task but can be used as both the topology and the supervisions (i.e., labels) in the edge prediction task; (ii) the node classification makes prediction over each individual node, while the edge prediction is determinated by each pair of nodes. To this end, we propose a novel edge prediction paradigm named Edge-aware Message PassIng neuRal nEtworks (EMPIRE). Concretely, we first introduce an edge splitting technique to specify use of each edge where each edge is solely used as either the topology or the supervision (named as topology edge or supervision edge). We then develop a new message passing mechanism that generates the messages to source nodes (through topology edges) being aware of target nodes (through supervision edges). In order to emphasize the differences between pairs connected by supervision edges and pairs unconnected, we further weight the messages to highlight the relative ones that can reflect the differences. In addition, we design a novel negative node-pair sampling trick that efficiently samples 'hard' negative instances in the supervision instances, and can significantly improve the performance. Experimental results verify that the proposed method can significantly outperform existing state-of-the-art models regarding the edge prediction task on multiple homogeneous and heterogeneous graph datasets.

* Pre-print

Via

Access Paper or Ask Questions

Learning Enhanced Representations for Tabular Data via Neighborhood Propagation

Jun 14, 2022

Kounianhua Du, Weinan Zhang, Ruiwen Zhou, Yangkun Wang, Xilong Zhao, Jiarui Jin, Quan Gan, Zheng Zhang, David Wipf

Figure 1 for Learning Enhanced Representations for Tabular Data via Neighborhood Propagation

Figure 2 for Learning Enhanced Representations for Tabular Data via Neighborhood Propagation

Figure 3 for Learning Enhanced Representations for Tabular Data via Neighborhood Propagation

Figure 4 for Learning Enhanced Representations for Tabular Data via Neighborhood Propagation

Abstract:Prediction over tabular data is an essential and fundamental problem in many important downstream tasks. However, existing methods either take a data instance of the table independently as input or do not fully utilize the multi-rows features and labels to directly change and enhance the target data representations. In this paper, we propose to 1) construct a hypergraph from relevant data instance retrieval to model the cross-row and cross-column patterns of those instances, and 2) perform message Propagation to Enhance the target data instance representation for Tabular prediction tasks. Specifically, our specially-designed message propagation step benefits from 1) fusion of label and features during propagation, and 2) locality-aware high-order feature interactions. Experiments on two important tabular data prediction tasks validate the superiority of the proposed PET model against other baselines. Additionally, we demonstrate the effectiveness of the model components and the feature enhancement ability of PET via various ablation studies and visualizations. The code is included in https://github.com/KounianhuaDu/PET.

Via

Access Paper or Ask Questions

Implicit vs Unfolded Graph Neural Networks

Nov 12, 2021

Yongyi Yang, Yangkun Wang, Zengfeng Huang, David Wipf

Figure 1 for Implicit vs Unfolded Graph Neural Networks

Figure 2 for Implicit vs Unfolded Graph Neural Networks

Figure 3 for Implicit vs Unfolded Graph Neural Networks

Abstract:It has been observed that graph neural networks (GNN) sometimes struggle to maintain a healthy balance between modeling long-range dependencies across nodes while avoiding unintended consequences such as oversmoothed node representations. To address this issue (among other things), two separate strategies have recently been proposed, namely implicit and unfolded GNNs. The former treats node representations as the fixed points of a deep equilibrium model that can efficiently facilitate arbitrary implicit propagation across the graph with a fixed memory footprint. In contrast, the latter involves treating graph propagation as the unfolded descent iterations as applied to some graph-regularized energy function. While motivated differently, in this paper we carefully elucidate the similarity and differences of these methods, quantifying explicit situations where the solutions they produced may actually be equivalent and others where behavior diverges. This includes the analysis of convergence, representational capacity, and interpretability. We also provide empirical head-to-head comparisons across a variety of synthetic and public real-world benchmarks.

Via

Access Paper or Ask Questions

Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features

Oct 26, 2021

Jiuhai Chen, Jonas Mueller, Vassilis N. Ioannidis, Soji Adeshina, Yangkun Wang, Tom Goldstein, David Wipf

Figure 1 for Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features

Figure 2 for Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features

Figure 3 for Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features

Abstract:For supervised learning with tabular data, decision tree ensembles produced via boosting techniques generally dominate real-world applications involving iid training/test sets. However for graph data where the iid assumption is violated due to structured relations between samples, it remains unclear how to best incorporate this structure within existing boosting pipelines. To this end, we propose a generalized framework for iterating boosting with graph propagation steps that share node/sample information across edges connecting related samples. Unlike previous efforts to integrate graph-based models with boosting, our approach is anchored in a principled meta loss function such that provable convergence can be guaranteed under relatively mild assumptions. Across a variety of non-iid graph datasets with tabular node features, our method achieves comparable or superior performance than both tabular and graph neural network models, as well as existing hybrid strategies that combine the two. Beyond producing better predictive performance than recently proposed graph models, our proposed techniques are easy to implement, computationally more efficient, and enjoy stronger theoretical guarantees (which make our results more reproducible).

Via

Access Paper or Ask Questions

Why Propagate Alone? Parallel Use of Labels and Features on Graphs

Oct 14, 2021

Yangkun Wang, Jiarui Jin, Weinan Zhang, Yongyi Yang, Jiuhai Chen, Quan Gan, Yong Yu, Zheng Zhang, Zengfeng Huang, David Wipf

Figure 1 for Why Propagate Alone? Parallel Use of Labels and Features on Graphs

Figure 2 for Why Propagate Alone? Parallel Use of Labels and Features on Graphs

Figure 3 for Why Propagate Alone? Parallel Use of Labels and Features on Graphs

Figure 4 for Why Propagate Alone? Parallel Use of Labels and Features on Graphs

Abstract:Graph neural networks (GNNs) and label propagation represent two interrelated modeling strategies designed to exploit graph structure in tasks such as node property prediction. The former is typically based on stacked message-passing layers that share neighborhood information to transform node features into predictive embeddings. In contrast, the latter involves spreading label information to unlabeled nodes via a parameter-free diffusion process, but operates independently of the node features. Given then that the material difference is merely whether features or labels are smoothed across the graph, it is natural to consider combinations of the two for improving performance. In this regard, it has recently been proposed to use a randomly-selected portion of the training labels as GNN inputs, concatenated with the original node features for making predictions on the remaining labels. This so-called label trick accommodates the parallel use of features and labels, and is foundational to many of the top-ranking submissions on the Open Graph Benchmark (OGB) leaderboard. And yet despite its wide-spread adoption, thus far there has been little attempt to carefully unpack exactly what statistical properties the label trick introduces into the training pipeline, intended or otherwise. To this end, we prove that under certain simplifying assumptions, the stochastic label trick can be reduced to an interpretable, deterministic training objective composed of two factors. The first is a data-fitting term that naturally resolves potential label leakage issues, while the second serves as a regularization factor conditioned on graph structure that adapts to graph size and connectivity. Later, we leverage this perspective to motivate a broader range of label trick use cases, and provide experiments to verify the efficacy of these extensions.

Via

Access Paper or Ask Questions

Bag of Tricks of Semi-Supervised Classification with Graph Neural Networks

Mar 24, 2021

Yangkun Wang

Figure 1 for Bag of Tricks of Semi-Supervised Classification with Graph Neural Networks

Figure 2 for Bag of Tricks of Semi-Supervised Classification with Graph Neural Networks

Figure 3 for Bag of Tricks of Semi-Supervised Classification with Graph Neural Networks

Figure 4 for Bag of Tricks of Semi-Supervised Classification with Graph Neural Networks

Abstract:Much of the recent progress made in node classification on graphs can be credited to the careful design on graph neural networks (GNN) and label propagation algorithms. However, in the literature, in addition to improvements to the model architecture, there are a number of improvements either briefly mentioned as implementation details or visible only in source code, and these overlooked techniques may play a pivotal role in their practical use. In this paper, we first summarize a collection of existing refinements, and then propose several novel techniques regarding these model designs and label usage. We empirically evaluate their impacts on the final model accuracy through ablation studies, and show that we are able to significantly improve various GNN models to the extent that they outweigh the gains from model architecture improvement. Notably, many of the top-ranked models on Open Graph Benchmark benefit from our techniques.

Via

Access Paper or Ask Questions

Graph Neural Networks Inspired by Classical Iterative Algorithms

Mar 10, 2021

Yongyi Yang, Tang Liu, Yangkun Wang, Jinjing Zhou, Quan Gan, Zhewei Wei, Zheng Zhang, Zengfeng Huang, David Wipf

Figure 1 for Graph Neural Networks Inspired by Classical Iterative Algorithms

Figure 2 for Graph Neural Networks Inspired by Classical Iterative Algorithms

Figure 3 for Graph Neural Networks Inspired by Classical Iterative Algorithms

Figure 4 for Graph Neural Networks Inspired by Classical Iterative Algorithms

Abstract:Despite the recent success of graph neural networks (GNN), common architectures often exhibit significant limitations, including sensitivity to oversmoothing, long-range dependencies, and spurious edges, e.g., as can occur as a result of graph heterophily or adversarial attacks. To at least partially address these issues within a simple transparent framework, we consider a new family of GNN layers designed to mimic and integrate the update rules of two classical iterative algorithms, namely, proximal gradient descent and iterative reweighted least squares (IRLS). The former defines an extensible base GNN architecture that is immune to oversmoothing while nonetheless capturing long-range dependencies by allowing arbitrary propagation steps. In contrast, the latter produces a novel attention mechanism that is explicitly anchored to an underlying end-toend energy function, contributing stability with respect to edge uncertainty. When combined we obtain an extremely simple yet robust model that we evaluate across disparate scenarios including standardized benchmarks, adversarially-perturbated graphs, graphs with heterophily, and graphs involving long-range dependencies. In doing so, we compare against SOTA GNN approaches that have been explicitly designed for the respective task, achieving competitive or superior node classification accuracy.

* Under review

Via

Access Paper or Ask Questions