Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Liangxin Qian

Private Transformer Inference in MLaaS: A Survey

May 15, 2025

Yang Li, Xinyu Zhou, Yitong Wang, Liangxin Qian, Jun Zhao

Abstract:Transformer models have revolutionized AI, powering applications like content generation and sentiment analysis. However, their deployment in Machine Learning as a Service (MLaaS) raises significant privacy concerns, primarily due to the centralized processing of sensitive user data. Private Transformer Inference (PTI) offers a solution by utilizing cryptographic techniques such as secure multi-party computation and homomorphic encryption, enabling inference while preserving both user data and model privacy. This paper reviews recent PTI advancements, highlighting state-of-the-art solutions and challenges. We also introduce a structured taxonomy and evaluation framework for PTI, focusing on balancing resource efficiency with privacy and bridging the gap between high-performance inference and data privacy.

Via

Access Paper or Ask Questions

A Survey on Private Transformer Inference

Dec 11, 2024

Yang Li, Xinyu Zhou, Yitong Wang, Liangxin Qian, Jun Zhao

Abstract:Transformer models have revolutionized AI, enabling applications like content generation and sentiment analysis. However, their use in Machine Learning as a Service (MLaaS) raises significant privacy concerns, as centralized servers process sensitive user data. Private Transformer Inference (PTI) addresses these issues using cryptographic techniques such as Secure Multi-Party Computation (MPC) and Homomorphic Encryption (HE), enabling secure model inference without exposing inputs or models. This paper reviews recent advancements in PTI, analyzing state-of-the-art solutions, their challenges, and potential improvements. We also propose evaluation guidelines to assess resource efficiency and privacy guarantees, aiming to bridge the gap between high-performance inference and data privacy.

* The manuscript is still being revised and will be continuously updated in the future

Via

Access Paper or Ask Questions

Data Processing Efficiency Aware User Association and Resource Allocation in Blockchain Enabled Metaverse over Wireless Communications

Nov 25, 2024

Liangxin Qian, Jun Zhao

Abstract:In the rapidly evolving landscape of the Metaverse, enhanced by blockchain technology, the efficient processing of data has emerged as a critical challenge, especially in wireless communication systems. Addressing this need, our paper introduces the innovative concept of data processing efficiency (DPE), aiming to maximize processed bits per unit of resource consumption in blockchain-empowered Metaverse environments. To achieve this, we propose the DPE-Aware User Association and Resource Allocation (DAUR) algorithm, a tailored solution for these complex systems. The DAUR algorithm transforms the challenging task of optimizing the sum of DPE ratios into a solvable convex optimization problem. It uniquely alternates the optimization of key variables like user association, work offloading ratios, task-specific computing resource distribution, bandwidth allocation, user power usage ratios, and server computing resource allocation ratios. Our extensive numerical results demonstrate the DAUR algorithm's effectiveness in DPE.

* This is the full version of the conference paper published in the Twenty-fifth International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing (MobiHoc 2024). DOI: https://doi.org/10.1145/3641512.3686376. arXiv admin note: text overlap with arXiv:2406.13602

Via

Access Paper or Ask Questions

Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks

Jun 19, 2024

Liangxin Qian, Jun Zhao

Abstract:With the evolution of artificial intelligence-generated content (AIGC) techniques and the development of space-air-ground integrated networks (SAGIN), there will be a growing opportunity to enhance more users' mobile experience with customized AIGC applications. This is made possible through the use of parameter-efficient fine-tuning (PEFT) training alongside mobile edge computing. In this paper, we formulate the optimization problem of maximizing the parameter training efficiency of the SAGIN system over wireless networks under limited resource constraints. We propose the Parameter training efficiency Aware Resource Allocation (PARA) technique to jointly optimize user association, data offloading, and communication and computational resource allocation. Solid proofs are presented to solve this difficult sum of ratios problem based on quadratically constrained quadratic programming (QCQP), semidefinite programming (SDP), graph theory, and fractional programming (FP) techniques. Our proposed PARA technique is effective in finding a stationary point of this non-convex problem. The simulation results demonstrate that the proposed PARA method outperforms other baselines.

* submitted to a journal

Via

Access Paper or Ask Questions

Self-Evolving Wireless Communications: A Novel Intelligence Trend for 6G and Beyond

Apr 07, 2024

Liangxin Qian, Ping Yang, Jun Zhao, Ze Chen, Wanbin Tang

Abstract:Wireless communication is rapidly evolving, and future wireless communications (6G and beyond) will be more heterogeneous, multi-layered, and complex, which poses challenges to traditional communications. Adaptive technologies in traditional communication systems respond to environmental changes by modifying system parameters and structures on their own and are not flexible and agile enough to satisfy requirements in future communications. To tackle these challenges, we propose a novel self-evolving communication framework, which consists of three layers: data layer, information layer, and knowledge layer. The first two layers allow communication systems to sense environments, fuse data, and generate a knowledge base for the knowledge layer. When dealing with a variety of application scenarios and environments, the generated knowledge is subsequently fed back to the first two layers for communication in practical application scenarios to obtain self-evolving ability and enhance the robustness of the system. In this paper, we first highlight the limitations of current adaptive communication systems and the need for intelligence, automation, and self-evolution in future wireless communications. We overview the development of self-evolving technologies and conceive the concept of self-evolving communications with its hypothetical architecture. To demonstrate the power of self-evolving modules, we compare the performances of a communication system with and without evolution. We then provide some potential techniques that enable self-evolving communications and challenges in implementing them.

Via

Access Paper or Ask Questions

User Connection and Resource Allocation Optimization in Blockchain Empowered Metaverse over 6G Wireless Communications

Mar 08, 2024

Liangxin Qian, Chang Liu, Jun Zhao

Abstract:The convergence of blockchain, Metaverse, and non-fungible tokens (NFTs) brings transformative digital opportunities alongside challenges like privacy and resource management. Addressing these, we focus on optimizing user connectivity and resource allocation in an NFT-centric and blockchain-enabled Metaverse in this paper. Through user work-offloading, we optimize data tasks, user connection parameters, and server computing frequency division. In the resource allocation phase, we optimize communication-computation resource distributions, including bandwidth, transmit power, and computing frequency. We introduce the trust-cost ratio (TCR), a pivotal measure combining trust scores from users' resources and server history with delay and energy costs. This balance ensures sustained user engagement and trust. The DASHF algorithm, central to our approach, encapsulates the Dinkelbach algorithm, alternating optimization, semidefinite relaxation (SDR), the Hungarian method, and a novel fractional programming technique from a recent IEEE JSAC paper [2]. The most challenging part of DASHF is to rewrite an optimization problem as Quadratically Constrained Quadratic Programming (QCQP) via carefully designed transformations, in order to be solved by SDR and the Hungarian algorithm. Extensive simulations validate the DASHF algorithm's efficacy, revealing critical insights for enhancing blockchain-Metaverse applications, especially with NFTs.

* IEEE Transactions on Wireless Communications (TWC), revision submitted. Full version of arXiv:2310.17872

Via

Access Paper or Ask Questions

User Association and Resource Allocation in Large Language Model Based Mobile Edge Computing System over Wireless Communications

Oct 27, 2023

Liangxin Qian, Jun Zhao

Abstract:In the rapidly evolving landscape of large language models (LLMs) and mobile edge computing, the need for efficient service delivery to mobile users with constrained computational resources has become paramount. Addressing this, our paper delves into a collaborative framework for model training where user data and model adapters are shared with servers to optimize performance. Within this framework, users initially update the first several layers of the adapters while freezing the other layers of them, leveraging their local datasets. Once this step is complete, these partially trained parameters are transmitted to servers. The servers, equipped with more robust computational capabilities, then update the subsequent layers. After this training, they send the enhanced parameters back to the users. This collaborative training approach ensures that mobile users with limited computational capacities can still benefit from advanced LLM services without being burdened by exhaustive computations. Central to our methodology is the DASHF algorithm, which encapsulates the Dinkelbach algorithm, alternating optimization, semidefinite relaxation (SDR), the Hungarian method, and a pioneering fractional programming technique from our recent IEEE JSAC paper "Human-Centric Resource Allocation in the Metaverse over Wireless Communications". The crux of DASHF is its capability to reformulate an optimization problem as Quadratically Constrained Quadratic Programming (QCQP) via meticulously crafted transformations, making it solvable by SDR and the Hungarian algorithm. Through extensive simulations, we demonstrate the effectiveness of the DASHF algorithm, offering significant insights for the advancement of collaborative LLM service deployments.

Via

Access Paper or Ask Questions

A Hybrid Framework of Reinforcement Learning and Convex Optimization for UAV-Based Autonomous Metaverse Data Collection

May 29, 2023

Peiyuan Si, Liangxin Qian, Jun Zhao, Kwok-Yan Lam

Abstract:Unmanned aerial vehicles (UAVs) are promising for providing communication services due to their advantages in cost and mobility, especially in the context of the emerging Metaverse and Internet of Things (IoT). This paper considers a UAV-assisted Metaverse network, in which UAVs extend the coverage of the base station (BS) to collect the Metaverse data generated at roadside units (RSUs). Specifically, to improve the data collection efficiency, resource allocation and trajectory control are integrated into the system model. The time-dependent nature of the optimization problem makes it non-trivial to be solved by traditional convex optimization methods. Based on the proposed UAV-assisted Metaverse network system model, we design a hybrid framework with reinforcement learning and convex optimization to {cooperatively} solve the time-sequential optimization problem. Simulation results show that the proposed framework is able to reduce the mission completion time with a given transmission power resource.

* This paper appears in IEEE Network magazine

Via

Access Paper or Ask Questions