Tong Qiao

DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection

Jan 13, 2026

Zhenhua Xu, Yiran Zhao, Mengting Zhong, Dezhang Kong, Changting Lin, Tong Qiao, Meng Han

Abstract: The rapid growth of large language models raises pressing concerns about intellectual property protection under black-box deployment. Existing backdoor-based fingerprints either rely on rare tokens, which produce high-perplexity inputs susceptible to filtering, or use fixed trigger-response mappings that are brittle to leakage and post-hoc adaptation. We propose Dual-Layer Nested Fingerprinting (DNF), a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers. Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility. Compared with existing methods, it uses lower-perplexity triggers, remains undetectable under fingerprint detection attacks, and is relatively robust to incremental fine-tuning and model merging. These results position DNF as a practical, stealthy, and resilient solution for LLM ownership verification and intellectual property protection.

Black-Box Guardrail Reverse-engineering Attack

Nov 06, 2025

Hongwei Yao, Yun Xia, Shuo Shao, Haoran Shi, Tong Qiao, Cong Wang

Abstract: Large language models (LLMs) increasingly employ guardrails to enforce ethical, legal, and application-specific constraints on their outputs. While effective at mitigating harmful responses, these guardrails introduce a new class of vulnerabilities by exposing observable decision patterns. In this work, we present the first study of black-box LLM guardrail reverse-engineering attacks. We propose Guardrail Reverse-engineering Attack (GRA), a reinforcement learning-based framework that leverages genetic algorithm-driven data augmentation to approximate the decision-making policy of victim guardrails. By iteratively collecting input-output pairs, prioritizing divergence cases, and applying targeted mutations and crossovers, our method incrementally converges toward a high-fidelity surrogate of the victim guardrail. We evaluate GRA on three widely deployed commercial systems, namely ChatGPT, DeepSeek, and Qwen3, and demonstrate that it achieves a rule matching rate exceeding 0.92 while requiring less than $85 in API costs. These findings underscore the practical feasibility of guardrail extraction and highlight significant security risks for current LLM safety mechanisms. Our findings expose critical vulnerabilities in current guardrail designs and highlight the urgent need for more robust defense mechanisms in LLM deployment.

HGNAS: Hardware-Aware Graph Neural Architecture Search for Edge Devices

Aug 23, 2024

Ao Zhou, Jianlei Yang, Yingjie Qi, Tong Qiao, Yumeng Shi, Cenlin Duan, Weisheng Zhao, Chunming Hu

Abstract: Graph Neural Networks (GNNs) are becoming increasingly popular for graph-based learning tasks such as point cloud processing due to their state-of-the-art (SOTA) performance. Nevertheless, the research community has primarily focused on improving model expressiveness and has given little consideration to designing efficient GNN models for edge scenarios with real-time requirements and limited resources. Examining existing GNN models reveals varied execution across platforms and frequent Out-Of-Memory (OOM) problems, highlighting the need for hardware-aware GNN design. To address this challenge, this work proposes a novel hardware-aware graph neural architecture search framework tailored for resource-constrained edge devices, namely HGNAS. To achieve hardware awareness, HGNAS integrates an efficient GNN hardware performance predictor that evaluates the latency and peak memory usage of GNNs in milliseconds. Meanwhile, we study GNN memory usage during inference and offer a peak memory estimation method, enhancing the robustness of architecture evaluations when combined with predictor outcomes. Furthermore, HGNAS constructs a fine-grained design space to enable the exploration of extreme performance architectures by decoupling the GNN paradigm. In addition, a multi-stage hierarchical search strategy is leveraged to navigate the huge number of candidates, which can reduce the time of a single search to a few GPU hours. To the best of our knowledge, HGNAS is the first automated GNN design framework for edge devices, and also the first work to achieve hardware awareness of GNNs across different platforms. Extensive experiments across various applications and edge devices demonstrate the superiority of HGNAS. It can achieve up to a 10.6x speedup and an 82.5% peak memory reduction with negligible accuracy loss compared to DGCNN on ModelNet40.

* Accepted by IEEE Transactions on Computers

GNNavigator: Towards Adaptive Training of Graph Neural Networks via Automatic Guideline Exploration

Apr 15, 2024

Tong Qiao, Jianlei Yang, Yingjie Qi, Ao Zhou, Chen Bai, Bei Yu, Weisheng Zhao, Chunming Hu

Abstract: Graph Neural Networks (GNNs) have recently achieved significant success in many applications. However, balancing GNN training runtime cost, memory consumption, and attainable accuracy for various applications is non-trivial. Previous training methodologies suffer from inferior adaptability and lack a unified training optimization solution. To address the problem, this work proposes GNNavigator, an adaptive GNN training configuration optimization framework. GNNavigator meets diverse GNN application requirements through our unified software-hardware co-abstraction, proposed GNN training performance model, and practical design space exploration solution. Experimental results show that GNNavigator can achieve up to 3.1x speedup and 44.9% peak memory reduction with comparable accuracy to state-of-the-art approaches.

* Accepted by DAC'24

Graph Neural Networks Automated Design and Deployment on Device-Edge Co-Inference Systems

Apr 08, 2024

Ao Zhou, Jianlei Yang, Tong Qiao, Yingjie Qi, Zhi Yang, Weisheng Zhao, Chunming Hu

Abstract: The key to the device-edge co-inference paradigm is to partition models into computation-friendly and computation-intensive parts across the device and the edge, respectively. However, for Graph Neural Networks (GNNs), we find that simply partitioning without altering their structures can hardly achieve the full potential of the co-inference paradigm due to the various computation and communication overheads of GNN operations over heterogeneous devices. We present GCoDE, the first automatic framework for GNNs that innovatively Co-designs the architecture search and the mapping of each operation on Device-Edge hierarchies. GCoDE abstracts the device communication process into an explicit operation and fuses the search of architecture and the operations mapping in a unified space for joint optimization. Also, the performance-awareness approach, utilized in the constraint-based search process of GCoDE, enables effective evaluation of architecture efficiency in diverse heterogeneous systems. We implement the co-inference engine and runtime dispatcher in GCoDE to enhance the deployment efficiency. Experimental results show that GCoDE can achieve up to 44.9x speedup and 98.2% energy reduction compared to existing approaches across various applications and system configurations.

* Accepted by DAC'24
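
The plain partitioning baseline that the abstract above contrasts GCoDE with can be illustrated with a small worked example. The sketch below is not GCoDE itself: it assumes a simple chain of layers, hypothetical per-layer latencies, and a fixed link bandwidth, and it only picks the split point that minimizes device compute plus transfer plus edge compute. The function name `best_split` and all parameters are illustrative assumptions.

```python
from typing import List, Tuple


def best_split(
    device_ms: List[float],      # assumed per-layer latency on the device
    edge_ms: List[float],        # assumed per-layer latency on the edge server
    out_kb: List[float],         # assumed output size of each layer
    input_kb: float,             # size of the raw input
    bandwidth_kb_per_ms: float,  # assumed constant link bandwidth
) -> Tuple[int, float]:
    """Return (k, latency): layers [0, k) run on the device, [k, n) on the edge."""
    n = len(device_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        if k == n:
            transfer_kb = 0.0            # everything stays on the device
        elif k == 0:
            transfer_kb = input_kb       # the raw input is shipped to the edge
        else:
            transfer_kb = out_kb[k - 1]  # the activation of layer k-1 is shipped
        latency = (
            sum(device_ms[:k])
            + transfer_kb / bandwidth_kb_per_ms
            + sum(edge_ms[k:])
        )
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Hypothetical numbers: the first layer shrinks the data, so splitting after it wins.
k, latency = best_split(
    device_ms=[4.0, 10.0, 12.0],
    edge_ms=[1.0, 2.0, 2.0],
    out_kb=[20.0, 5.0, 1.0],
    input_kb=100.0,
    bandwidth_kb_per_ms=10.0,
)
print(k, latency)
```

With these made-up numbers the candidate latencies are 15, 10, 16.5, and 26 ms for k = 0 to 3, so the split after the first layer is chosen. The sketch ignores the cost of returning results to the device and treats the architecture as fixed, which is exactly the limitation the abstract says joint architecture and mapping search is meant to overcome.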

Architectural Implications of GNN Aggregation Programming Abstractions

Oct 21, 2023

Yingjie Qi, Jianlei Yang, Ao Zhou, Tong Qiao, Chunming Hu

Abstract: Graph neural networks (GNNs) have gained significant popularity due to their powerful capability to extract useful representations from graph data. As the need for efficient GNN computation intensifies, a variety of programming abstractions designed for optimizing GNN Aggregation have emerged to facilitate acceleration. However, there is no comprehensive evaluation and analysis of existing abstractions, and thus no clear consensus on which approach is better. In this letter, we classify existing programming abstractions for GNN Aggregation by the dimension of data organization and propagation method. By constructing these abstractions on a state-of-the-art GNN library, we perform a thorough and detailed characterization study to compare their performance and efficiency, and provide several insights on future GNN acceleration based on our analysis.

* 4 pages, to be published in IEEE Computer Architecture Letters (CAL)

Hardware-Aware Graph Neural Network Automated Design for Edge Computing Platforms

Mar 20, 2023

Ao Zhou, Jianlei Yang, Yingjie Qi, Yumeng Shi, Tong Qiao, Weisheng Zhao, Chunming Hu

Abstract: Graph neural networks (GNNs) have emerged as a popular strategy for handling non-Euclidean data due to their state-of-the-art performance. However, most current GNN model designs focus mainly on task accuracy and do not consider the hardware resource limitations and real-time requirements of edge application scenarios. Comprehensive profiling of typical GNN models indicates that their execution characteristics vary significantly across different computing platforms, which demands hardware awareness for efficient GNN designs. In this work, HGNAS is proposed as the first Hardware-aware Graph Neural Architecture Search framework targeting resource-constrained edge devices. By decoupling the GNN paradigm, HGNAS constructs a fine-grained design space and leverages an efficient multi-stage search strategy to explore optimal architectures within a few GPU hours. Moreover, HGNAS achieves hardware awareness during the GNN architecture design by leveraging a hardware performance predictor, which balances GNN model accuracy and efficiency according to the characteristics of the targeted devices. Experimental results show that HGNAS can achieve about 10.6x speedup and 88.2% peak memory reduction with a negligible accuracy loss compared to DGCNN on various edge devices, including Nvidia RTX3080, Jetson TX2, Intel i7-8700K and Raspberry Pi 3B+.

* Accepted by DAC'23

Optimizing Memory Efficiency of Graph Neural Networks on Edge Computing Platforms

Apr 12, 2021

Ao Zhou, Jianlei Yang, Yeqi Gao, Tong Qiao, Yingjie Qi, Xiaoyi Wang, Yunli Chen, Pengcheng Dai, Weisheng Zhao, Chunming Hu

Abstract: Graph neural networks (GNNs) have achieved state-of-the-art performance on various industrial tasks. However, the poor efficiency of GNN inference and frequent Out-Of-Memory (OOM) problems limit the successful application of GNNs on edge computing platforms. To tackle these problems, a feature decomposition approach is proposed for memory efficiency optimization of GNN inference. The proposed approach achieves outstanding optimization on various GNN models, covering a wide range of datasets, and speeds up inference by up to 3x. Furthermore, the proposed feature decomposition significantly reduces peak memory usage (up to 5x improvement in memory efficiency) and mitigates OOM problems during GNN inference.

* This paper has been accepted by RTAS 2021 (brief industry track), with link to publicly available code
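
The abstract above does not spell out how the feature decomposition works. One plausible reading, offered here purely as an assumption, is that neighbor aggregation is carried out on slices of the feature dimension so that only one slice is in the working set at a time. The sketch below illustrates that idea with sum aggregation in plain Python; the function names and the `chunk` parameter are hypothetical and are not taken from the paper or its code.

```python
from typing import Dict, List

Matrix = List[List[float]]


def aggregate_sum(adj: Dict[int, List[int]], feats: Matrix) -> Matrix:
    """Sum the neighbor features of every node over all feature columns at once."""
    dim = len(feats[0])
    return [
        [sum(feats[u][d] for u in adj[v]) for d in range(dim)]
        for v in range(len(feats))
    ]


def aggregate_sum_chunked(adj: Dict[int, List[int]], feats: Matrix, chunk: int) -> Matrix:
    """Same result as aggregate_sum, computed one slice of feature columns at a time."""
    n, dim = len(feats), len(feats[0])
    out = [[0.0] * dim for _ in range(n)]
    for start in range(0, dim, chunk):
        stop = min(start + chunk, dim)
        # Assumption: only this column slice needs to be resident during aggregation.
        block = [row[start:stop] for row in feats]
        partial = aggregate_sum(adj, block)
        for v in range(n):
            out[v][start:stop] = partial[v]
    return out


# Tiny hypothetical graph: node -> list of neighbors.
adj = {0: [1, 2], 1: [0], 2: [0, 1]}
feats = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(aggregate_sum_chunked(adj, feats, chunk=2) == aggregate_sum(adj, feats))
```

Sum aggregation is independent across feature columns, so the chunked result matches the full one. Whether the actual method splits features this way, and how it obtains the reported speedup, is not stated in the abstract.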
