Yixuan Ma

Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models

Apr 17, 2025

Zhouhao Sun, Xiao Ding, Li Du, Yunpeng Xu, Yixuan Ma, Yang Zhao, Bing Qin, Ting Liu

Abstract: Despite significant progress, recent studies indicate that current large language models (LLMs) may still capture dataset biases and utilize them during inference, leading to poor generalizability. However, due to the diversity of dataset biases and the insufficiency of bias suppression based on in-context learning, the effectiveness of previous prior-knowledge-based debiasing methods and in-context-learning-based automatic debiasing methods is limited. To address these challenges, we explore the combination of causal mechanisms with information theory and propose an information gain-guided causal intervention debiasing (IGCIDB) framework. This framework first utilizes an information gain-guided causal intervention method to automatically and autonomously balance the distribution of the instruction-tuning dataset. Subsequently, it employs a standard supervised fine-tuning process to train LLMs on the debiased dataset. Experimental results show that IGCIDB can effectively debias LLMs and improve their generalizability across different tasks.
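
The abstract does not spell out how information gain is computed or how the dataset is rebalanced. As a rough illustration only, the sketch below assumes the textbook definition, IG = H(Y) - H(Y | B), for a discrete candidate bias feature B and label Y. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(features, labels):
    """H(Y) - H(Y | B) for a discrete feature B and a discrete label Y."""
    n = len(labels)
    groups = {}
    for b, y in zip(features, labels):
        groups.setdefault(b, []).append(y)
    # Conditional entropy: entropy of the labels within each feature group,
    # weighted by the share of examples that fall in that group.
    conditional = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(labels) - conditional


# Toy data (hypothetical): a cue value of 1 means a surface pattern is present.
labels = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"]
biased_cue = [1, 1, 1, 1, 0, 0, 0, 0]    # cue perfectly predicts the label
balanced_cue = [1, 1, 0, 0, 1, 1, 0, 0]  # cue is independent of the label

print(information_gain(biased_cue, labels))
print(information_gain(balanced_cue, labels))
```

Under this reading, a feature with high information gain is one a model could exploit as a shortcut, and a balanced dataset is one in which that gain has been driven toward zero.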

Causal-Guided Active Learning for Debiasing Large Language Models

Aug 23, 2024

Zhouhao Sun, Li Du, Xiao Ding, Yixuan Ma, Kaitao Qiu, Ting Liu, Bing Qin

Abstract: Although achieving promising performance, recent analyses show that current generative large language models (LLMs) may still capture dataset biases and utilize them for generation, leading to poor generalizability and harmfulness of LLMs. However, due to the diversity of dataset biases and the over-optimization problem, previous prior-knowledge-based debiasing methods and fine-tuning-based debiasing methods may not be suitable for current LLMs. To address this issue, we explore combining active learning with causal mechanisms and propose a causal-guided active learning (CAL) framework, which utilizes the LLMs themselves to automatically and autonomously identify informative biased samples and induce the bias patterns. Then a cost-effective and efficient in-context-learning-based method is employed to prevent LLMs from utilizing dataset biases during generation. Experimental results show that CAL can effectively recognize typical biased instances and induce various bias patterns for debiasing LLMs.

* ACL main conference

Entropy Neural Estimation for Graph Contrastive Learning

Jul 26, 2023

Yixuan Ma, Xiaolin Zhang, Peng Zhang, Kun Zhan

Abstract: Contrastive learning on graphs aims at extracting distinguishable high-level representations of nodes. In this paper, we theoretically illustrate that the entropy of a dataset can be approximated by maximizing the lower bound of the mutual information across different views of a graph, i.e., entropy is estimated by a neural network. Based on this finding, we propose a simple yet effective subset sampling strategy to contrast pairwise representations between views of a dataset. In particular, we randomly sample nodes and edges from a given graph to build the input subset for a view. Two views are fed into a parameter-shared Siamese network to extract the high-dimensional embeddings and estimate the information entropy of the entire graph. For the learning process, we propose to optimize the network using two objectives simultaneously. Concretely, the input of the contrastive loss function consists of positive and negative pairs. Our pair selection strategy differs from previous works: we present a novel strategy to enhance the representation ability of the graph encoder by selecting nodes based on cross-view similarities. We enrich the diversity of the positive and negative pairs by selecting highly similar samples and totally different data, respectively, with the guidance of cross-view similarity scores. We also introduce a cross-view consistency constraint on the representations generated from the different views. This objective guarantees that the learned representations are consistent across views from the perspective of the entire graph. We conduct extensive experiments on seven graph benchmarks, and the proposed approach achieves competitive performance compared to the current state-of-the-art methods. The source code will be publicly released once this paper is accepted.
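
The abstract does not say which mutual-information lower bound is maximized. A widely used one in contrastive learning is InfoNCE, for which log N minus the loss lower-bounds the mutual information between views in expectation. The sketch below assumes that bound, a cosine-similarity critic, and a hypothetical temperature of 0.5. It is an illustration of the general idea, not the authors' estimator.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; row i of view_a is the positive of row i of view_b."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the positive and all negatives.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / n


def mi_lower_bound(view_a, view_b, temperature=0.5):
    """InfoNCE estimate log N - loss, which can never exceed log N."""
    return math.log(len(view_a)) - info_nce(view_a, view_b, temperature)


# Toy embeddings (hypothetical): three nodes seen in two views.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned_b = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
shuffled_b = [[0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]]

print(mi_lower_bound(view_a, aligned_b))
print(mi_lower_bound(view_a, shuffled_b))
```

When the two views agree node by node, the loss is small and the estimate approaches its ceiling of log N; when the correspondence is scrambled, the estimate drops.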

* ACM MM 2023

Stationary Diffusion State Neural Estimation for Multiview Clustering

Dec 02, 2021

Chenghua Liu, Zhuolin Liao, Yixuan Ma, Kun Zhan

Abstract: Although many graph-based clustering methods attempt to model the stationary diffusion state in their objectives, their performance is limited by the use of a predefined graph. We argue that the estimation of the stationary diffusion state can be achieved by gradient descent over neural networks. We specifically design the Stationary Diffusion State Neural Estimation (SDSNE) to exploit multiview structural graph information for co-supervised learning. We explore how to design a graph neural network specifically for unsupervised multiview learning and integrate multiple graphs into a unified consensus graph through a shared self-attentional module. The view-shared self-attentional module utilizes the graph structure to learn a view-consistent global graph. Meanwhile, instead of the auto-encoder used in most unsupervised graph neural networks, SDSNE uses a co-supervised strategy with structure information to supervise model learning. The co-supervised strategy, serving as the loss function, guides SDSNE toward the stationary state. With the help of the loss and the self-attentional module, we learn a graph in which the nodes in each connected component are fully connected with the same weight. Experiments on several multiview datasets demonstrate the effectiveness of SDSNE in terms of six clustering evaluation metrics.
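
To make the target state concrete, the sketch below shows classical diffusion on a fixed graph, which is the predefined-graph setting the abstract contrasts itself with, not SDSNE itself. It assumes a symmetric, doubly stochastic weight matrix with self-loops (built here as I - 0.3 L from the graph Laplacian L); the matrix and the step count are hypothetical. Under that assumption, repeated diffusion converges to a graph in which the nodes of each connected component are fully connected with the same weight.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def diffuse(weights, steps=200):
    """Apply the diffusion operator repeatedly, returning weights^(steps + 1)."""
    state = weights
    for _ in range(steps):
        state = matmul(state, weights)
    return state


# Two connected components: a path over nodes 0-1-2 and an edge over nodes 3-4.
# Every row and column sums to 1, and the self-loops keep the walk aperiodic.
w = [
    [0.7, 0.3, 0.0, 0.0, 0.0],
    [0.3, 0.4, 0.3, 0.0, 0.0],
    [0.0, 0.3, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.7, 0.3],
    [0.0, 0.0, 0.0, 0.3, 0.7],
]

stationary = diffuse(w)
for row in stationary:
    print([round(value, 4) for value in row])
```

In the result, entries within the three-node component share one weight, entries within the two-node component share another, and entries across components stay at zero, so the connected components can be read off as clusters.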

* AAAI 2022

A Comprehensive Study on Learning-Based PE Malware Family Classification Methods

Oct 29, 2021

Yixuan Ma, Shuang Liu, Jiajun Jiang, Guanhong Chen, Keqiu Li

Abstract: Driven by high profit, Portable Executable (PE) malware has been consistently evolving in terms of both volume and sophistication. PE malware family classification has gained great attention, and a large number of approaches have been proposed. With the rapid development of machine learning techniques and the exciting results they have achieved on various tasks, machine learning algorithms have also gained popularity in the PE malware family classification task. The three mainstream approaches that use learning-based algorithms, as categorized by the input format the methods take, are image-based, binary-based, and disassembly-based approaches. Although a large number of approaches have been published, there are no consistent comparisons of those approaches, especially from the perspective of practical industry adoption. Moreover, there is no comparison in the scenario of concept drift, which is a fact of the malware classification task due to the fast-evolving nature of malware. In this work, we conduct a thorough empirical study of learning-based PE malware classification approaches on 4 different datasets under consistent experiment settings. Based on the experiment results and an interview with our industry partners, we find that (1) there is no individual class of methods that significantly outperforms the others; (2) all classes of methods show performance degradation on concept drift (by an average F1-score of 32.23%); and (3) the prediction time and high memory consumption hinder existing approaches from being adopted for industry usage.

* 12 pages, 3 figures, conference

An improved helmet detection method for YOLOv3 on an unbalanced dataset

Nov 09, 2020

Rui Geng, Yixuan Ma, Wanhong Huang

Abstract: The YOLOv3 target detection algorithm is widely used in industry due to its high speed and high accuracy, but it has some limitations, such as accuracy degradation on unbalanced datasets. This work uses a Gaussian fuzzy data augmentation approach to pre-process the dataset and improve the YOLOv3 target detection algorithm. With this efficient pre-processing, the confidence level of YOLOv3 generally improves by 0.01-0.02 without changing its recognition speed, and the processed images also perform better in image localization due to effective feature fusion, which better meets the requirements for recognition speed and accuracy in production.
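
The abstract gives no details of the augmentation step. The sketch below assumes that "Gaussian fuzzy" refers to ordinary Gaussian blurring, and it applies a separable Gaussian kernel to a grayscale image stored as a list of rows. The radius, sigma, edge handling, and function names are all hypothetical choices, and how the blurred copies are used to rebalance classes is not covered.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    weights = [
        math.exp(-(x * x) / (2.0 * sigma * sigma))
        for x in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(kernel):
                src = min(max(x + k - radius, 0), width - 1)
                acc += weight * row[src]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image):
    """Swap rows and columns of a rectangular image."""
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, radius=1, sigma=1.0):
    """Separable 2-D blur: a horizontal pass followed by a vertical pass."""
    kernel = gaussian_kernel(radius, sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


# Toy image (hypothetical): a single bright pixel in the middle of a 5x5 grid.
impulse = [[0.0] * 5 for _ in range(5)]
impulse[2][2] = 1.0
blurred = gaussian_blur(impulse)
for row in blurred:
    print([round(value, 3) for value in row])
```

A blurred copy keeps the same bounding boxes as its source image, which is one reason blur-style augmentation is convenient for detection datasets: extra samples of a rare class can be generated without relabeling.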
