Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Bo Dong

CLDG: Contrastive Learning on Dynamic Graphs

Dec 19, 2024

Yiming Xu, Bin Shi, Teng Ma, Bo Dong, Haoyi Zhou, Qinghua Zheng

Figure 1 for CLDG: Contrastive Learning on Dynamic Graphs

Figure 2 for CLDG: Contrastive Learning on Dynamic Graphs

Figure 3 for CLDG: Contrastive Learning on Dynamic Graphs

Figure 4 for CLDG: Contrastive Learning on Dynamic Graphs

Abstract:The graph with complex annotations is the most potent data type, whose constantly evolving motivates further exploration of the unsupervised dynamic graph representation. One of the representative paradigms is graph contrastive learning. It constructs self-supervised signals by maximizing the mutual information between the statistic graph's augmentation views. However, the semantics and labels may change within the augmentation process, causing a significant performance drop in downstream tasks. This drawback becomes greatly magnified on dynamic graphs. To address this problem, we designed a simple yet effective framework named CLDG. Firstly, we elaborate that dynamic graphs have temporal translation invariance at different levels. Then, we proposed a sampling layer to extract the temporally-persistent signals. It will encourage the node to maintain consistent local and global representations, i.e., temporal translation invariance under the timespan views. The extensive experiments demonstrate the effectiveness and efficiency of the method on seven datasets by outperforming eight unsupervised state-of-the-art baselines and showing competitiveness against four semi-supervised methods. Compared with the existing dynamic graph method, the number of model parameters and training time is reduced by an average of 2,001.86 times and 130.31 times on seven datasets, respectively.

* Accepted by ICDE2023

Via

Access Paper or Ask Questions

Training a Label-Noise-Resistant GNN with Reduced Complexity

Nov 17, 2024

Rui Zhao, Bin Shi, Zhiming Liang, Jianfei Ruan, Bo Dong, Lu Lin

Figure 1 for Training a Label-Noise-Resistant GNN with Reduced Complexity

Figure 2 for Training a Label-Noise-Resistant GNN with Reduced Complexity

Figure 3 for Training a Label-Noise-Resistant GNN with Reduced Complexity

Figure 4 for Training a Label-Noise-Resistant GNN with Reduced Complexity

Abstract:Graph Neural Networks (GNNs) have been widely employed for semi-supervised node classification tasks on graphs. However, the performance of GNNs is significantly affected by label noise, that is, a small amount of incorrectly labeled nodes can substantially misguide model training. Mainstream solutions define node classification with label noise (NCLN) as a reliable labeling task, often introducing node similarity with quadratic computational complexity to more accurately assess label reliability. To this end, in this paper, we introduce the Label Ensemble Graph Neural Network (LEGNN), a lower complexity method for robust GNNs training against label noise. LEGNN reframes NCLN as a label ensemble task, gathering informative multiple labels instead of constructing a single reliable label, avoiding high-complexity computations for reliability assessment. Specifically, LEGNN conducts a two-step process: bootstrapping neighboring contexts and robust learning with gathered multiple labels. In the former step, we apply random neighbor masks for each node and gather the predicted labels as a high-probability label set. This mitigates the impact of inaccurately labeled neighbors and diversifies the label set. In the latter step, we utilize a partial label learning based strategy to aggregate the high-probability label information for model training. Additionally, we symmetrically gather a low-probability label set to counteract potential noise from the bootstrapped high-probability label set. Extensive experiments on six datasets demonstrate that LEGNN achieves outstanding performance while ensuring efficiency. Moreover, it exhibits good scalability on dataset with over one hundred thousand nodes and one million edges.

Via

Access Paper or Ask Questions

DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

Aug 20, 2024

Zhijing Shao, Duotun Wang, Qing-Yao Tian, Yao-Dong Yang, Hengyu Meng, Zeyu Cai, Bo Dong, Yu Zhang, Kang Zhang, Zeyu Wang

Figure 1 for DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

Figure 2 for DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

Figure 3 for DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

Figure 4 for DEGAS: Detailed Expressions on Full-Body Gaussian Avatars

Abstract:Although neural rendering has made significant advancements in creating lifelike, animatable full-body and head avatars, incorporating detailed expressions into full-body avatars remains largely unexplored. We present DEGAS, the first 3D Gaussian Splatting (3DGS)-based modeling method for full-body avatars with rich facial expressions. Trained on multiview videos of a given subject, our method learns a conditional variational autoencoder that takes both the body motion and facial expression as driving signals to generate Gaussian maps in the UV layout. To drive the facial expressions, instead of the commonly used 3D Morphable Models (3DMMs) in 3D head avatars, we propose to adopt the expression latent space trained solely on 2D portrait images, bridging the gap between 2D talking faces and 3D avatars. Leveraging the rendering capability of 3DGS and the rich expressiveness of the expression latent space, the learned avatars can be reenacted to reproduce photorealistic rendering images with subtle and accurate facial expressions. Experiments on an existing dataset and our newly proposed dataset of full-body talking avatars demonstrate the efficacy of our method. We also propose an audio-driven extension of our method with the help of 2D talking faces, opening new possibilities to interactive AI agents.

Via

Access Paper or Ask Questions

Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning

May 08, 2024

Rui Zhao, Bin Shi, Jianfei Ruan, Tianze Pan, Bo Dong

Abstract:In noisy label learning, estimating noisy class posteriors plays a fundamental role for developing consistent classifiers, as it forms the basis for estimating clean class posteriors and the transition matrix. Existing methods typically learn noisy class posteriors by training a classification model with noisy labels. However, when labels are incorrect, these models may be misled to overemphasize the feature parts that do not reflect the instance characteristics, resulting in significant errors in estimating noisy class posteriors. To address this issue, this paper proposes to augment the supervised information with part-level labels, encouraging the model to focus on and integrate richer information from various parts. Specifically, our method first partitions features into distinct parts by cropping instances, yielding part-level labels associated with these various parts. Subsequently, we introduce a novel single-to-multiple transition matrix to model the relationship between the noisy and part-level labels, which incorporates part-level labels into a classifier-consistent framework. Utilizing this framework with part-level labels, we can learn the noisy class posteriors more precisely by guiding the model to integrate information from various parts, ultimately improving the classification performance. Our method is theoretically sound, while experiments show that it is empirically effective in synthetic and real-world noisy benchmarks.

* CVPR 2024

Via

Access Paper or Ask Questions

GraphPub: Generation of Differential Privacy Graph with High Availability

Mar 05, 2024

Wanghan Xu, Bin Shi, Ao Liu, Jiqiang Zhang, Bo Dong

Abstract:In recent years, with the rapid development of graph neural networks (GNN), more and more graph datasets have been published for GNN tasks. However, when an upstream data owner publishes graph data, there are often many privacy concerns, because many real-world graph data contain sensitive information like person's friend list. Differential privacy (DP) is a common method to protect privacy, but due to the complex topological structure of graph data, applying DP on graphs often affects the message passing and aggregation of GNN models, leading to a decrease in model accuracy. In this paper, we propose a novel graph edge protection framework, graph publisher (GraphPub), which can protect graph topology while ensuring that the availability of data is basically unchanged. Through reverse learning and the encoder-decoder mechanism, we search for some false edges that do not have a large negative impact on the aggregation of node features, and use them to replace some real edges. The modified graph will be published, which is difficult to distinguish between real and false data. Sufficient experiments prove that our framework achieves model accuracy close to the original graph with an extremely low privacy budget.

Via

Access Paper or Ask Questions

Homography Initialization and Dynamic Weighting Algorithm Based on a Downward-Looking Camera and IMU

Nov 16, 2023

Bo Dong, Yongkang Tao, Deng Peng, Zhigang Fu

Figure 1 for Homography Initialization and Dynamic Weighting Algorithm Based on a Downward-Looking Camera and IMU

Figure 2 for Homography Initialization and Dynamic Weighting Algorithm Based on a Downward-Looking Camera and IMU

Figure 3 for Homography Initialization and Dynamic Weighting Algorithm Based on a Downward-Looking Camera and IMU

Figure 4 for Homography Initialization and Dynamic Weighting Algorithm Based on a Downward-Looking Camera and IMU

Abstract:In recent years, the technology in visual-inertial odometry (VIO) has matured considerably and has been widely used in many applications. However, we still encounter challenges when applying VIO to a micro air vehicle (MAV) equipped with a downward-looking camera. Specifically, VIO cannot compute the correct initialization results during take-off and the cumulative drift is large when the MAV is flying in the air. To overcome these problems, we propose a homographybased initialization method, which utilizes the fact that the features detected by the downward-looking camera during take-off are approximately on the same plane. Then we introduce the prior normal vector and motion field to make states more accurate. In addition, to deal with the cumulative drift, a strategy for dynamically weighting visual residuals is proposed. Finally, we evaluate our method on the collected real-world datasets. The results demonstrate that our system can be successfully initialized no matter how the MAV takes off and the positioning errors are also greatly improved.

Via

Access Paper or Ask Questions

Efficient LLM Inference on CPUs

Nov 01, 2023

Haihao Shen, Hanwen Chang, Bo Dong, Yu Luo, Hengyu Meng

Figure 1 for Efficient LLM Inference on CPUs

Figure 2 for Efficient LLM Inference on CPUs

Figure 3 for Efficient LLM Inference on CPUs

Figure 4 for Efficient LLM Inference on CPUs

Abstract:Large language models (LLMs) have demonstrated remarkable performance and tremendous potential across a wide range of tasks. However, deploying these models has been challenging due to the astronomical amount of model parameters, which requires a demand for large memory capacity and high memory bandwidth. In this paper, we propose an effective approach that can make the deployment of LLMs more efficiently. We support an automatic INT4 weight-only quantization flow and design a special LLM runtime with highly-optimized kernels to accelerate the LLM inference on CPUs. We demonstrate the general applicability of our approach on popular LLMs including Llama2, Llama, GPT-NeoX, and showcase the extreme inference efficiency on CPUs. The code is publicly available at: https://github.com/intel/intel-extension-for-transformers.

* NeurIPS'2023 on Efficient Natural Language and Speech Processing

Via

Access Paper or Ask Questions

A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks

Oct 10, 2023

Yang Wang, Bo Dong, Ke Xu, Haiyin Piao, Yufei Ding, Baocai Yin, Xin Yang

Figure 1 for A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks

Figure 2 for A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks

Figure 3 for A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks

Figure 4 for A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks

Abstract:Deep Neural Networks (DNNs) are widely used for computer vision tasks. However, it has been shown that deep models are vulnerable to adversarial attacks, i.e., their performances drop when imperceptible perturbations are made to the original inputs, which may further degrade the following visual tasks or introduce new problems such as data and privacy security. Hence, metrics for evaluating the robustness of deep models against adversarial attacks are desired. However, previous metrics are mainly proposed for evaluating the adversarial robustness of shallow networks on the small-scale datasets. Although the Cross Lipschitz Extreme Value for nEtwork Robustness (CLEVER) metric has been proposed for large-scale datasets (e.g., the ImageNet dataset), it is computationally expensive and its performance relies on a tractable number of samples. In this paper, we propose the Adversarial Converging Time Score (ACTS), an attack-dependent metric that quantifies the adversarial robustness of a DNN on a specific input. Our key observation is that local neighborhoods on a DNN's output surface would have different shapes given different inputs. Hence, given different inputs, it requires different time for converging to an adversarial sample. Based on this geometry meaning, ACTS measures the converging time as an adversarial robustness metric. We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset using state-of-the-art deep networks. Extensive experiments show that our ACTS metric is an efficient and effective adversarial metric over the previous CLEVER metric.

* ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMM)

Via

Access Paper or Ask Questions

Compressing Context to Enhance Inference Efficiency of Large Language Models

Oct 09, 2023

Yucheng Li, Bo Dong, Chenghua Lin, Frank Guerin

Abstract:Large language models (LLMs) achieved remarkable performance across various tasks. However, they face challenges in managing long documents and extended conversations, due to significantly increased computational requirements, both in memory and inference time, and potential context truncation when the input exceeds the LLM's fixed context length. This paper proposes a method called Selective Context that enhances the inference efficiency of LLMs by identifying and pruning redundancy in the input context to make the input more compact. We test our approach using common data sources requiring long context processing: arXiv papers, news articles, and long conversations, on tasks of summarisation, question answering, and response generation. Experimental results show that Selective Context significantly reduces memory cost and decreases generation latency while maintaining comparable performance compared to that achieved when full context is used. Specifically, we achieve a 50\% reduction in context cost, resulting in a 36\% reduction in inference memory usage and a 32\% reduction in inference time, while observing only a minor drop of .023 in BERTscore and .038 in faithfulness on four downstream applications, indicating that our method strikes a good balance between efficiency and performance.

* EMNLP 2023. arXiv admin note: substantial text overlap with arXiv:2304.12102; text overlap with arXiv:2303.11076 by other authors

Via

Access Paper or Ask Questions

In the Blink of an Eye: Event-based Emotion Recognition

Oct 06, 2023

Haiwei Zhang, Jiqing Zhang, Bo Dong, Pieter Peers, Wenwei Wu, Xiaopeng Wei, Felix Heide, Xin Yang

Figure 1 for In the Blink of an Eye: Event-based Emotion Recognition

Figure 2 for In the Blink of an Eye: Event-based Emotion Recognition

Figure 3 for In the Blink of an Eye: Event-based Emotion Recognition

Figure 4 for In the Blink of an Eye: Event-based Emotion Recognition

Abstract:We introduce a wearable single-eye emotion recognition device and a real-time approach to recognizing emotions from partial observations of an emotion that is robust to changes in lighting conditions. At the heart of our method is a bio-inspired event-based camera setup and a newly designed lightweight Spiking Eye Emotion Network (SEEN). Compared to conventional cameras, event-based cameras offer a higher dynamic range (up to 140 dB vs. 80 dB) and a higher temporal resolution. Thus, the captured events can encode rich temporal cues under challenging lighting conditions. However, these events lack texture information, posing problems in decoding temporal information effectively. SEEN tackles this issue from two different perspectives. First, we adopt convolutional spiking layers to take advantage of the spiking neural network's ability to decode pertinent temporal information. Second, SEEN learns to extract essential spatial cues from corresponding intensity frames and leverages a novel weight-copy scheme to convey spatial attention to the convolutional spiking layers during training and inference. We extensively validate and demonstrate the effectiveness of our approach on a specially collected Single-eye Event-based Emotion (SEE) dataset. To the best of our knowledge, our method is the first eye-based emotion recognition method that leverages event-based cameras and spiking neural network.

* Special Interest Group for Computer GRAPHICS,2023

Via

Access Paper or Ask Questions