Zheng Gao

Are LLMs Better GNN Helpers? Rethinking Robust Graph Learning under Deficiencies with Iterative Refinement

Oct 02, 2025

Zhaoyan Wang, Zheng Gao, Arogya Kharel, In-Young Ko

Abstract:Graph Neural Networks (GNNs) are widely adopted in Web-related applications, serving as a core technique for learning from graph-structured data, such as text-attributed graphs. Yet in real-world scenarios, such graphs exhibit deficiencies that substantially undermine GNN performance. While prior GNN-based augmentation studies have explored robustness against individual imperfections, a systematic understanding of how graph-native and Large Language Model (LLM)-enhanced methods behave under compound deficiencies is still missing. Specifically, there has been no comprehensive investigation comparing conventional approaches and recent LLM-on-graph frameworks, leaving their relative merits unclear. To fill this gap, we conduct the first empirical study that benchmarks these two lines of methods across diverse graph deficiencies, revealing overlooked vulnerabilities and challenging the assumption that LLM augmentation is consistently superior. Building on these empirical findings, we propose the Robust Graph Learning via Retrieval-Augmented Contrastive Refinement (RoGRAD) framework. Unlike prior one-shot LLM-as-Enhancer designs, RoGRAD is the first iterative paradigm that leverages Retrieval-Augmented Generation (RAG) to inject retrieval-grounded augmentations, supplying class-consistent, diverse augmentations and enforcing discriminative representations through iterative graph contrastive learning. It transforms LLM augmentation for graphs from static signal injection into dynamic refinement. Extensive experiments demonstrate RoGRAD's superiority over both conventional GNN- and LLM-enhanced baselines, achieving up to 82.43% average improvement.

* 14 pages

Citation Recommendation based on Argumentative Zoning of User Queries

Jan 30, 2025

Shutian Ma, Chengzhi Zhang, Heng Zhang, Zheng Gao

Abstract:Citation recommendation aims to locate important papers for scholars to cite. When writing citing sentences, authors usually hold different citing intents, which are referred to as citation functions in citation analysis. Since argumentative zoning identifies the argumentative and rhetorical structure of scientific literature, we want to use this information to improve the citation recommendation task. In this paper, a multi-task learning model is built for citation recommendation and argumentative zoning classification. We also generated an annotated corpus from PubMed Central data based on a new argumentative zoning schema. The experimental results show that, by considering the argumentative information in the citing sentence, the citation recommendation model achieves better performance.

* Journal of Informetrics, 2025

Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning

Oct 22, 2024

Haining Wang, Jason Clark, Hannah McKelvey, Leila Sterman, Zheng Gao, Zuoyu Tian, Sandra Kübler, Xiaozhong Liu

Abstract:A vast amount of scholarly work is published daily, yet much of it remains inaccessible to the general public due to dense jargon and complex language. To address this challenge in science communication, we introduce a reinforcement learning framework that fine-tunes a language model to rewrite scholarly abstracts into more comprehensible versions. Guided by a carefully balanced combination of word- and sentence-level accessibility rewards, our language model effectively substitutes technical terms with more accessible alternatives, a task that models trained with supervised fine-tuning or guided by conventional readability measures struggle to accomplish. Our best model adjusts the readability level of scholarly abstracts by approximately six U.S. grade levels -- in other words, from a postgraduate to a high school level. This translates to roughly a 90% relative boost over the supervised fine-tuning baseline, all while maintaining factual accuracy and high-quality language. An in-depth analysis of our approach shows that balanced rewards lead to systematic modifications in the base model, likely contributing to smoother optimization and superior performance. We envision this work as a step toward bridging the gap between scholarly research and the general public, particularly younger readers and those without a college degree.

Self-Supervised Facial Representation Learning with Facial Region Awareness

Mar 04, 2024

Zheng Gao, Ioannis Patras

Abstract:Self-supervised pre-training has proven effective in learning transferable representations that benefit various visual tasks. This paper asks: can self-supervised pre-training learn general facial representations for various facial analysis tasks? Recent efforts toward this goal are limited to treating each face image as a whole, i.e., learning consistent facial representations at the image level, which overlooks the consistency of local facial representations (i.e., facial regions such as the eyes and nose). In this work, we make a first attempt at a novel self-supervised facial representation learning framework, Facial Region Awareness (FRA), which learns consistent global and local facial representations. Specifically, we explicitly enforce the consistency of facial regions by matching the local facial representations across views, which are extracted with learned heatmaps highlighting the facial regions. Inspired by the mask prediction in supervised semantic segmentation, we obtain the heatmaps via cosine similarity between the per-pixel projection of feature maps and facial mask embeddings computed from learnable positional embeddings, which leverage the attention mechanism to globally look up the facial image for facial regions. To learn such heatmaps, we formulate the learning of facial mask embeddings as a deep clustering problem by assigning the pixel features from the feature maps to them. The transfer learning results on facial classification and regression tasks show that FRA outperforms previous pre-trained models and, more importantly, that with ResNet as the unified backbone for various tasks, FRA achieves comparable or even better performance than SOTA methods in facial analysis tasks.
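The heatmap step described in this abstract (cosine similarity between per-pixel features and facial mask embeddings) can be illustrated with a minimal sketch. The function names, the tiny 2x2 feature map, and the two hand-written mask embeddings are illustrative assumptions; the abstract does not specify tensor shapes, the projection head, or how the embeddings are learned.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def region_heatmaps(pixel_features, mask_embeddings):
    """Return one H x W heatmap per mask embedding.

    pixel_features: H x W grid of per-pixel feature vectors (assumed already projected).
    mask_embeddings: list of vectors, one per facial region (hypothetical values here).
    """
    return [
        [[cosine(pixel, embedding) for pixel in row] for row in pixel_features]
        for embedding in mask_embeddings
    ]

# Toy 2 x 2 feature map with 2-dimensional features (made-up numbers).
pixel_features = [
    [(1.0, 0.0), (0.0, 1.0)],
    [(1.0, 1.0), (0.0, 2.0)],
]
mask_embeddings = [(1.0, 0.0), (0.0, 1.0)]
heatmaps = region_heatmaps(pixel_features, mask_embeddings)
```

Each heatmap is high where a pixel's feature points in the same direction as the corresponding mask embedding, regardless of feature magnitude.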

Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features

Sep 01, 2023

Zheng Gao, Chen Feng, Ioannis Patras

Abstract:Whilst contrastive learning yields powerful representations by matching different augmented views of the same instance, it lacks the ability to capture the similarities between different instances. One popular way to address this limitation is by learning global features (after the global pooling) to capture inter-instance relationships based on knowledge distillation, where the global features of the teacher are used to guide the learning of the global features of the student. Inspired by cross-modality learning, we extend this existing framework that only learns from global features by encouraging the global features and intermediate layer features to learn from each other. This leads to our novel self-supervised framework: cross-context learning between global and hypercolumn features (CGH), that enforces the consistency of instance relations between low- and high-level semantics. Specifically, we stack the intermediate feature maps to construct a hypercolumn representation so that we can measure instance relations using two contexts (hypercolumn and global feature) separately, and then use the relations of one context to guide the learning of the other. This cross-context learning allows the model to learn from the differences between the two contexts. The experimental results on linear classification and downstream tasks show that our method outperforms the state-of-the-art methods.
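As a rough illustration of using the instance relations of one context to guide the other, the sketch below computes a softmax over similarities to a set of anchor instances in each context and penalises disagreement with a symmetric cross-entropy. This is a simplified reading of the abstract, not the authors' implementation: the temperature, the symmetric loss, the anchor sets, and all function names are assumptions, and the teacher-student distillation setup is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def relation(query, anchors, temperature=0.1):
    """Softmax over cosine similarities between one instance and a set of anchors."""
    logits = [cosine(query, anchor) / temperature for anchor in anchors]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted):
    """Cross-entropy of a predicted distribution against a target distribution."""
    return -sum(t * math.log(p + 1e-12) for t, p in zip(target, predicted))

def cross_context_loss(q_global, q_hyper, anchors_global, anchors_hyper):
    """Each context's relation distribution serves as the target for the other."""
    rel_global = relation(q_global, anchors_global)
    rel_hyper = relation(q_hyper, anchors_hyper)
    return cross_entropy(rel_global, rel_hyper) + cross_entropy(rel_hyper, rel_global)

# Two anchor instances, seen in a 2-d "global" space and a 3-d "hypercolumn" space.
anchors_global = [(1.0, 0.0), (0.0, 1.0)]
anchors_hyper = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# Both contexts agree that the query resembles the first anchor.
loss_agree = cross_context_loss((1.0, 0.1), (1.0, 0.1, 0.0), anchors_global, anchors_hyper)
# The hypercolumn context now ranks the second anchor highest instead.
loss_disagree = cross_context_loss((1.0, 0.1), (0.1, 1.0, 0.0), anchors_global, anchors_hyper)
```

The loss is small when both contexts rank the anchors the same way and large when they disagree, which is the consistency signal the abstract describes.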

FrFT based estimation of linear and nonlinear impairments using Vision Transformer

Aug 25, 2023

Ting Jiang, Zheng Gao, Yizhao Chen, Zihe Hu, Ming Tang

Abstract:To comprehensively assess optical fiber communication system conditions, it is essential to jointly estimate the following four critical impairments: nonlinear signal-to-noise ratio (SNR_NL), optical signal-to-noise ratio (OSNR), chromatic dispersion (CD) and differential group delay (DGD). However, current studies identify only a limited number of impairments within a narrow range, due to limitations in network capabilities and the lack of a unified representation of impairments. To address these challenges, we adopt time-frequency signal processing based on the fractional Fourier transform (FrFT) to achieve a unified representation of impairments, while employing a Transformer-based neural network (NN) to break through network performance limitations. To verify the effectiveness of the proposed estimation method, a numerical simulation is carried out on a 5-channel polarization-division-multiplexed quadrature phase shift keying (PDM-QPSK) long-haul optical transmission system with a symbol rate of 50 GBaud per channel. The mean absolute error (MAE) for SNR_NL, OSNR, CD, and DGD estimation is 0.091 dB, 0.058 dB, 117 ps/nm, and 0.38 ps, and the monitoring window ranges over 0~20 dB, 10~30 dB, 0~51000 ps/nm, and 0~100 ps, respectively. Our proposed method achieves accurate estimation of linear and nonlinear impairments over a broad range, representing a significant advancement in the field of optical performance monitoring (OPM).

* 15 pages, 10 figures

Two-stage Voice Application Recommender System for Unhandled Utterances in Intelligent Personal Assistant

Oct 19, 2021

Wei Xiao, Qian Hu, Thahir Mohamed, Zheng Gao, Xibin Gao, Radhika Arava, Mohamed AbdelHady

Abstract:Intelligent personal assistants (IPA) enable voice applications that facilitate people's daily tasks. However, due to the complexity and ambiguity of voice requests, some requests may not be handled properly by the standard natural language understanding (NLU) component. In such cases, a simple reply like "Sorry, I don't know" hurts the user's experience and limits the functionality of IPA. In this paper, we propose a two-stage shortlister-reranker recommender system to match third-party voice applications (skills) to unhandled utterances. In this approach, a skill shortlister is proposed to retrieve candidate skills from the skill catalog by calculating both lexical and semantic similarity between skills and user requests. We also illustrate how to build a new system using observed data collected from a baseline rule-based system, and how exposure bias can create a discrepancy between offline and human metrics. Lastly, we present two relabeling methods that can handle the incomplete ground truth and mitigate exposure bias. We demonstrate the effectiveness of our proposed system through extensive offline experiments. Furthermore, we present online A/B testing results that show a significant boost in user experience satisfaction.

* 9 pages, IRS KDD workshop 2021

Improving Community Detection Performance in Heterogeneous Music Network by Learning Edge-type Usefulness Distribution

May 03, 2021

Zheng Gao, Chun Guo, Xiaozhong Liu

Abstract:Music is becoming an essential part of daily life. There is an urgent need to develop recommendation systems that help people find better songs with less effort. As the interactions between users and songs naturally form a complex network, community detection approaches can be applied to reveal users' potential interests in songs by grouping relevant users & songs into the same community. However, the diversity of interaction types challenges conventional community detection methods, which were originally designed for homogeneous networks. Although existing works address heterogeneous community detection, they are mostly task-driven approaches and are not directly applicable to music retrieval and recommendation. In this paper, we propose a genetic-based approach to learn an edge-type usefulness distribution (ETUD) for all edge types in heterogeneous music networks. ETUD can be regarded as a linear function that projects all edges into the same latent space and makes them comparable. A heterogeneous network can therefore be converted to a homogeneous one to which conventional methods can be applied. We validate the proposed model on a heterogeneous music network constructed from an online music streaming service. Results show that, for conventional methods, ETUD helps detect communities that significantly improve music recommendation accuracy while simultaneously reducing user searching cost.
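A minimal sketch of the conversion step, assuming the usefulness distribution is applied as a per-edge-type weight and that parallel edges between the same pair of nodes are summed. The abstract does not give the exact formula, so the summation rule, the edge-type names, and the weight values are illustrative assumptions; the genetic search that learns the distribution is not shown.

```python
from collections import defaultdict

def collapse_edge_types(typed_edges, usefulness):
    """Merge typed edges into one weighted, undirected, homogeneous edge map.

    typed_edges: iterable of (node_u, node_v, edge_type) triples.
    usefulness: dict mapping edge_type to its weight (hypothetical ETUD values).
    """
    weights = defaultdict(float)
    for u, v, edge_type in typed_edges:
        key = tuple(sorted((u, v)))  # undirected: (u, v) and (v, u) share a key
        weights[key] += usefulness.get(edge_type, 0.0)
    return dict(weights)

# Made-up interactions between users and songs.
typed_edges = [
    ("user_a", "song_1", "listen"),
    ("user_a", "song_1", "like"),
    ("user_b", "song_1", "listen"),
    ("user_b", "song_2", "share"),
]
# Hypothetical usefulness distribution over edge types (sums to 1).
usefulness = {"listen": 0.2, "like": 0.5, "share": 0.3}

homogeneous = collapse_edge_types(typed_edges, usefulness)
```

The resulting single-weight edge map can be passed to any community detection method that accepts a weighted homogeneous graph.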

Typilus: Neural Type Hints

Apr 06, 2020

Miltiadis Allamanis, Earl T. Barr, Soline Ducousso, Zheng Gao

Abstract:Type inference over partial contexts in dynamically typed languages is challenging. In this work, we present a graph neural network model that predicts types by probabilistically reasoning over a program's structure, names, and patterns. The network uses deep similarity learning to learn a TypeSpace -- a continuous relaxation of the discrete space of types -- and how to embed the type properties of a symbol (i.e. identifier) into it. Importantly, our model can employ one-shot learning to predict an open vocabulary of types, including rare and user-defined ones. We realise our approach in Typilus for Python that combines the TypeSpace with an optional type checker. We show that Typilus accurately predicts types. Typilus confidently predicts types for 70% of all annotatable symbols; when it predicts a type, that type optionally type checks 95% of the time. Typilus can also find incorrect type annotations; two important and popular open source libraries, fairseq and allennlp, accepted our pull requests that fixed the annotation errors Typilus discovered.
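One way to picture prediction in a learned embedding space with an open type vocabulary is a nearest-neighbour lookup over symbols whose types are already known. The sketch below assumes a distance-weighted k-nearest-neighbour vote; the 2-d embeddings, the `predict_type` function, and the `mylib.Tensor` type are hypothetical, and in the real system the embeddings would come from the graph neural network rather than being written by hand.

```python
import math
from collections import defaultdict

def predict_type(query, type_map, k=3):
    """Distance-weighted k-nearest-neighbour vote over (embedding, type) pairs.

    Returns the winning type name and its share of the total vote.
    """
    nearest = sorted(type_map, key=lambda item: math.dist(query, item[0]))[:k]
    scores = defaultdict(float)
    for embedding, type_name in nearest:
        scores[type_name] += 1.0 / (math.dist(query, embedding) + 1e-6)
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())

# Hand-written stand-ins for symbol embeddings with known annotations.
type_map = [
    ((0.0, 0.0), "int"),
    ((0.1, 0.0), "int"),
    ((1.0, 1.0), "str"),
    ((0.9, 1.1), "str"),
]
common_type, common_conf = predict_type((0.05, 0.02), type_map)

# Adding a single example of a new user-defined type needs no retraining here.
type_map.append(((5.0, 5.0), "mylib.Tensor"))
rare_type, rare_conf = predict_type((5.1, 4.9), type_map)
```

Because a new type only needs one entry in the lookup table, rare and user-defined types can be predicted as soon as a single annotated example is available, and the vote share gives a confidence that can be thresholded.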

* Accepted to PLDI 2020

AMAD: Adversarial Multiscale Anomaly Detection on High-Dimensional and Time-Evolving Categorical Data

Jul 12, 2019

Zheng Gao, Lin Guo, Chi Ma, Xiao Ma, Kai Sun, Hang Xiang, Xiaoqiang Zhu, Hongsong Li, Xiaozhong Liu

Abstract:Anomaly detection faces emerging challenges in many important industry domains, such as cyber security and online recommendation and advertising. The recent trend in these areas calls for anomaly detection on time-evolving data with high-dimensional categorical features and without labeled samples. There is also an increasing demand for identifying and monitoring irregular patterns at multiple resolutions. In this work, we propose a unified end-to-end approach to solve these challenges by combining the advantages of the Adversarial Autoencoder and the Recurrent Neural Network. The model learns data representations across different scales with attention mechanisms, on which an enhanced two-resolution anomaly detector is developed for both instances and data blocks. Extensive experiments are performed over three types of datasets to demonstrate the efficacy of our method and its superiority over state-of-the-art approaches.

* Accepted by 2019 KDD Workshop on Deep Learning Practice for High-Dimensional Sparse Data
