Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yingguang Yang

Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks

May 22, 2025

Hongyuan Tao, Ying Zhang, Zhenhao Tang, Hongen Peng, Xukun Zhu, Bingchang Liu, Yingguang Yang, Ziyin Zhang, Zhaogui Xu, Haipeng Zhang(+5 more)

Abstract:Recent advances in Large Language Models (LLMs) have shown promise in function-level code generation, yet repository-level software engineering tasks remain challenging. Current solutions predominantly rely on proprietary LLM agents, which introduce unpredictability and limit accessibility, raising concerns about data privacy and model customization. This paper investigates whether open-source LLMs can effectively address repository-level tasks without requiring agent-based approaches. We demonstrate this is possible by enabling LLMs to comprehend functions and files within codebases through their semantic information and structural dependencies. To this end, we introduce Code Graph Models (CGMs), which integrate repository code graph structures into the LLM's attention mechanism and map node attributes to the LLM's input space using a specialized adapter. When combined with an agentless graph RAG framework, our approach achieves a 43.00% resolution rate on the SWE-bench Lite benchmark using the open-source Qwen2.5-72B model. This performance ranks first among open weight models, second among methods with open-source systems, and eighth overall, surpassing the previous best open-source model-based method by 12.33%.

* 31 pages, 9 figures

Via

Access Paper or Ask Questions

Certainly Bot Or Not? Trustworthy Social Bot Detection via Robust Multi-Modal Neural Processes

Mar 11, 2025

Qi Wu, Yingguang Yang, hao liu, Hao Peng, Buyun He, Yutong Xia, Yong Liao

Abstract:Social bot detection is crucial for mitigating misinformation, online manipulation, and coordinated inauthentic behavior. While existing neural network-based detectors perform well on benchmarks, they struggle with generalization due to distribution shifts across datasets and frequently produce overconfident predictions for out-of-distribution accounts beyond the training data. To address this, we introduce a novel Uncertainty Estimation for Social Bot Detection (UESBD) framework, which quantifies the predictive uncertainty of detectors beyond mere classification. For this task, we propose Robust Multi-modal Neural Processes (RMNP), which aims to enhance the robustness of multi-modal neural processes to modality inconsistencies caused by social bot camouflage. RMNP first learns unimodal representations through modality-specific encoders. Then, unimodal attentive neural processes are employed to encode the Gaussian distribution of unimodal latent variables. Furthermore, to avoid social bots stealing human features to camouflage themselves thus causing certain modalities to provide conflictive information, we introduce an evidential gating network to explicitly model the reliability of modalities. The joint latent distribution is learned through the generalized product of experts, which takes the reliability of each modality into consideration during fusion. The final prediction is obtained through Monte Carlo sampling of the joint latent distribution followed by a decoder. Experiments on three real-world benchmarks show the effectiveness of RMNP in classification and uncertainty estimation, as well as its robustness to modality conflicts.

* 12 pages. 7 figures

Via

Access Paper or Ask Questions

SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection

May 18, 2024

Yingguang Yang, Qi Wu, Buyun He, Hao Peng, Renyu Yang, Zhifeng Hao, Yong Liao

Figure 1 for SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection

Figure 2 for SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection

Figure 3 for SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection

Figure 4 for SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection

Abstract:Recent advancements in social bot detection have been driven by the adoption of Graph Neural Networks. The social graph, constructed from social network interactions, contains benign and bot accounts that influence each other. However, previous graph-based detection methods that follow the transductive message-passing paradigm may not fully utilize hidden graph information and are vulnerable to adversarial bot behavior. The indiscriminate message passing between nodes from different categories and communities results in excessively homogeneous node representations, ultimately reducing the effectiveness of social bot detectors. In this paper, we propose SEBot, a novel multi-view graph-based contrastive learning-enabled social bot detector. In particular, we use structural entropy as an uncertainty metric to optimize the entire graph's structure and subgraph-level granularity, revealing the implicitly existing hierarchical community structure. And we design an encoder to enable message passing beyond the homophily assumption, enhancing robustness to adversarial behaviors of social bots. Finally, we employ multi-view contrastive learning to maximize mutual information between different views and enhance the detection performance through multi-task learning. Experimental results demonstrate that our approach significantly improves the performance of social bot detection compared with SOTA methods.

* KDD 2024

Via

Access Paper or Ask Questions

BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers

Apr 24, 2024

Buyun He, Yingguang Yang, Qi Wu, Hao Liu, Renyu Yang, Hao Peng, Xiang Wang, Yong Liao, Pengyuan Zhou

Figure 1 for BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers

Figure 2 for BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers

Figure 3 for BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers

Figure 4 for BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers

Abstract:Detecting social bots has evolved into a pivotal yet intricate task, aimed at combating the dissemination of misinformation and preserving the authenticity of online interactions. While earlier graph-based approaches, which leverage topological structure of social networks, yielded notable outcomes, they overlooked the inherent dynamicity of social networks -- In reality, they largely depicted the social network as a static graph and solely relied on its most recent state. Due to the absence of dynamicity modeling, such approaches are vulnerable to evasion, particularly when advanced social bots interact with other users to camouflage identities and escape detection. To tackle these challenges, we propose BotDGT, a novel framework that not only considers the topological structure, but also effectively incorporates dynamic nature of social network. Specifically, we characterize a social network as a dynamic graph. A structural module is employed to acquire topological information from each historical snapshot. Additionally, a temporal module is proposed to integrate historical context and model the evolving behavior patterns exhibited by social bots and legitimate users. Experimental results demonstrate the superiority of BotDGT against the leading methods that neglected the dynamic nature of social networks in terms of accuracy, recall, and F1-score.

* IJCAI 2024

Via

Access Paper or Ask Questions

FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection

Mar 10, 2023

Yingguang Yang, Renyu Yang, Hao Peng, Yangyang Li, Tong Li, Yong Liao, Pengyuan Zhou

Figure 1 for FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection

Figure 2 for FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection

Figure 3 for FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection

Figure 4 for FedACK: Federated Adversarial Contrastive Knowledge Distillation for Cross-Lingual and Cross-Model Social Bot Detection

Abstract:Social bot detection is of paramount importance to the resilience and security of online social platforms. The state-of-the-art detection models are siloed and have largely overlooked a variety of data characteristics from multiple cross-lingual platforms. Meanwhile, the heterogeneity of data distribution and model architecture makes it intricate to devise an efficient cross-platform and cross-model detection framework. In this paper, we propose FedACK, a new federated adversarial contrastive knowledge distillation framework for social bot detection. We devise a GAN-based federated knowledge distillation mechanism for efficiently transferring knowledge of data distribution among clients. In particular, a global generator is used to extract the knowledge of global data distribution and distill it into each client's local model. We leverage local discriminator to enable customized model design and use local generator for data enhancement with hard-to-decide samples. Local training is conducted as multi-stage adversarial and contrastive learning to enable consistent feature spaces among clients and to constrain the optimization direction of local models, reducing the divergences between local and global models. Experiments demonstrate that FedACK outperforms the state-of-the-art approaches in terms of accuracy, communication efficiency, and feature space consistency.

* Accepted by the ACM Web Conference 2023 (WWW'23)

Via

Access Paper or Ask Questions

RoSGAS: Adaptive Social Bot Detection with Reinforced Self-Supervised GNN Architecture Search

Jun 14, 2022

Yingguang Yang, Renyu Yang, Yangyang Li, Kai Cui, Zhiqin Yang, Yue Wang, Jie Xu, Haiyong Xie

Figure 1 for RoSGAS: Adaptive Social Bot Detection with Reinforced Self-Supervised GNN Architecture Search

Figure 2 for RoSGAS: Adaptive Social Bot Detection with Reinforced Self-Supervised GNN Architecture Search

Figure 3 for RoSGAS: Adaptive Social Bot Detection with Reinforced Self-Supervised GNN Architecture Search

Figure 4 for RoSGAS: Adaptive Social Bot Detection with Reinforced Self-Supervised GNN Architecture Search

Abstract:Social bots are referred to as the automated accounts on social networks that make attempts to behave like human. While Graph Neural Networks (GNNs) has been massively applied to the field of social bot detection, a huge amount of domain expertise and prior knowledge is heavily engaged in the state-of-the art approaches to design a dedicated neural network architecture for a specific classification task. Involving oversized nodes and network layers in the model design, however, usually causes the over-smoothing problem and the lack of embedding discrimination. In this paper, we propose RoSGAS, a novel Reinforced and Self-supervised GNN Architecture Search framework to adaptively pinpoint the most suitable multi-hop neighborhood and the number of layers in the GNN architecture. More specifically, we consider the social bot detection problem as a user-centric subgraph embedding and classification task. We exploit heterogeneous information network to present the user connectivity by leveraging account metadata, relationships, behavioral features and content features. RoSGAS uses a multi-agent deep reinforcement learning (RL) mechanism for navigating the search of optimal neighborhood and network layers to learn individually the subgraph embedding for each target user. A nearest neighbor mechanism is developed for accelerating the RL training process, and RoSGAS can learn more discriminative subgraph embedding with the aid of self-supervised learning. Experiments on 5 Twitter datasets show that RoSGAS outperforms the state-of-the-art approaches in terms of accuracy, training efficiency and stability, and has better generalization when handling unseen samples.

* 32 pages, 12 figures

Via

Access Paper or Ask Questions

Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Dec 30, 2021

Bin Chong, Yingguang Yang, Zi-Le Wang, Hang Xing, Zhirong Liu

Figure 1 for Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Figure 2 for Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Figure 3 for Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Figure 4 for Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Abstract:Most algorithms for the multi-armed bandit problem in reinforcement learning aimed to maximize the expected reward, which are thus useful in searching the optimized candidate with the highest reward (function value) for diverse applications (e.g., AlphaGo). However, in some typical application scenaios such as drug discovery, the aim is to search a diverse set of candidates with high reward. Here we propose a reversible upper confidence bound (rUCB) algorithm for such a purpose, and demonstrate its application in virtual screening upon intrinsically disordered proteins (IDPs). It is shown that rUCB greatly reduces the query times while achieving both high accuracy and low performance loss.The rUCB may have potential application in multipoint optimization and other reinforcement-learning cases.

* Phys. Chem. Chem. Phys. 23 (11), 6800-6806 (2021)
* 10 pages, 10 figures

Via

Access Paper or Ask Questions