Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tingting Yang

Wireless Large AI Model: Shaping the AI-Native Future of 6G and Beyond

Apr 20, 2025

Fenghao Zhu, Xinquan Wang, Xinyi Li, Maojun Zhang, Yixuan Chen, Chongwen Huang, Zhaohui Yang, Xiaoming Chen, Zhaoyang Zhang, Richeng Jin(+13 more)

Abstract:The emergence of sixth-generation and beyond communication systems is expected to fundamentally transform digital experiences through introducing unparalleled levels of intelligence, efficiency, and connectivity. A promising technology poised to enable this revolutionary vision is the wireless large AI model (WLAM), characterized by its exceptional capabilities in data processing, inference, and decision-making. In light of these remarkable capabilities, this paper provides a comprehensive survey of WLAM, elucidating its fundamental principles, diverse applications, critical challenges, and future research opportunities. We begin by introducing the background of WLAM and analyzing the key synergies with wireless networks, emphasizing the mutual benefits. Subsequently, we explore the foundational characteristics of WLAM, delving into their unique relevance in wireless environments. Then, the role of WLAM in optimizing wireless communication systems across various use cases and the reciprocal benefits are systematically investigated. Furthermore, we discuss the integration of WLAM with emerging technologies, highlighting their potential to enable transformative capabilities and breakthroughs in wireless communication. Finally, we thoroughly examine the high-level challenges hindering the practical implementation of WLAM and discuss pivotal future research directions.

Via

Access Paper or Ask Questions

EdgePrompt: A Distributed Key-Value Inference Framework for LLMs in 6G Networks

Apr 16, 2025

Jiahong Ning, Pengyan Zhu, Ce Zheng, Gary Lee, Sumei Sun, Tingting Yang

Abstract:As sixth-generation (6G) networks advance, large language models (LLMs) are increasingly integrated into 6G infrastructure to enhance network management and intelligence. However, traditional LLMs architecture struggle to meet the stringent latency and security requirements of 6G, especially as the increasing in sequence length leads to greater task complexity. This paper proposes Edge-Prompt, a cloud-edge collaborative framework based on a hierarchical attention splicing mechanism. EdgePrompt employs distributed key-value (KV) pair optimization techniques to accelerate inference and adapt to network conditions. Additionally, to reduce the risk of data leakage, EdgePrompt incorporates a privacy preserving strategy by isolating sensitive information during processing. Experiments on public dataset show that EdgePrompt effectively improves the inference throughput and reduces the latency, which provides a reliable solution for LLMs deployment in 6G environments.

Via

Access Paper or Ask Questions

MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration

Mar 07, 2025

Jinguang Wang, Jingyu Wang, Haifeng Sun, Tingting Yang, Zirui Zhuang, Wanyi Ning, Yuexi Yin, Qi Qi, Jianxin Liao

Abstract:Quantization has been widely used to compress and accelerate inference of large language models (LLMs). Existing methods focus on exploring the per-token dynamic calibration to ensure both inference acceleration and model accuracy under 4-bit quantization. However, in autoregressive generation inference of long sequences, the overhead of repeated dynamic quantization and dequantization steps becomes considerably expensive. In this work, we propose MergeQuant, an accurate and efficient per-channel static quantization framework. MergeQuant integrates the per-channel quantization steps with the corresponding scalings and linear mappings through a Quantization Step Migration (QSM) method, thereby eliminating the quantization overheads before and after matrix multiplication. Furthermore, in view of the significant differences between the different channel ranges, we propose dimensional reconstruction and adaptive clipping to address the non-uniformity of quantization scale factors and redistribute the channel variations to the subsequent modules to balance the parameter distribution under QSM. Within the static quantization setting of W4A4, MergeQuant reduces the accuracy gap on zero-shot tasks compared to FP16 baseline to 1.3 points on Llama-2-70B model. On Llama-2-7B model, MergeQuant achieves up to 1.77x speedup in decoding, and up to 2.06x speedup in end-to-end compared to FP16 baseline.

Via

Access Paper or Ask Questions

WirelessGPT: A Generative Pre-trained Multi-task Learning Framework for Wireless Communication

Feb 08, 2025

Tingting Yang, Ping Zhang, Mengfan Zheng, Yuxuan Shi, Liwen Jing, Jianbo Huang, Nan Li

Abstract:This paper introduces WirelessGPT, a pioneering foundation model specifically designed for multi-task learning in wireless communication and sensing. Specifically, WirelessGPT leverages large-scale wireless channel datasets for unsupervised pretraining and extracting universal channel representations, which captures complex spatiotemporal dependencies. In fact,this task-agnostic design adapts WirelessGPT seamlessly to a wide range of downstream tasks, using a unified representation with minimal fine-tuning. By unifying communication and sensing functionalities, WirelessGPT addresses the limitations of task-specific models, offering a scalable and efficient solution for integrated sensing and communication (ISAC). With an initial parameter size of around 80 million, WirelessGPT demonstrates significant improvements over conventional methods and smaller AI models, reducing reliance on large-scale labeled data. As the first foundation model capable of supporting diverse tasks across different domains, WirelessGPT establishes a new benchmark, paving the way for future advancements in multi-task wireless systems.

* 8 pages, 4 figures

Via

Access Paper or Ask Questions

TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation

Sep 25, 2024

Tingting Yang, Liang Xiao, Yizhe Zhang

Figure 1 for TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation

Figure 2 for TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation

Figure 3 for TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation

Figure 4 for TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation

Abstract:A global threshold (e.g., 0.5) is often applied to determine which bounding boxes should be included in the final results for an object detection task. A higher threshold reduces false positives but may result in missing a significant portion of true positives. A lower threshold can increase detection recall but may also result in more false positives. Because of this, using a preset global threshold (e.g., 0.5) applied to all the bounding box candidates may lead to suboptimal solutions. In this paper, we propose a Test-time Self-guided Bounding-box Propagation (TSBP) method, leveraging Earth Mover's Distance (EMD) to enhance object detection in histology images. TSBP utilizes bounding boxes with high confidence to influence those with low confidence, leveraging visual similarities between them. This propagation mechanism enables bounding boxes to be selected in a controllable, explainable, and robust manner, which surpasses the effectiveness of using simple thresholds and uncertainty calibration methods. Importantly, TSBP does not necessitate additional labeled samples for model training or parameter estimation, unlike calibration methods. We conduct experiments on gland detection and cell detection tasks in histology images. The results show that our proposed TSBP significantly improves detection outcomes when working in conjunction with state-of-the-art deep learning-based detection networks. Compared to other methods such as uncertainty calibration, TSBP yields more robust and accurate object detection predictions while using no additional labeled samples. The code is available at https://github.com/jwhgdeu/TSBP.

* MICCAI 2024

Via

Access Paper or Ask Questions

Micro-Expression Recognition by Motion Feature Extraction based on Pre-training

Jul 10, 2024

Ruolin Li, Lu Wang, Tingting Yang, Lisheng Xu, Bingyang Ma, Yongchun Li, Hongchao Wei

Figure 1 for Micro-Expression Recognition by Motion Feature Extraction based on Pre-training

Figure 2 for Micro-Expression Recognition by Motion Feature Extraction based on Pre-training

Figure 3 for Micro-Expression Recognition by Motion Feature Extraction based on Pre-training

Figure 4 for Micro-Expression Recognition by Motion Feature Extraction based on Pre-training

Abstract:Micro-expressions (MEs) are spontaneous, unconscious facial expressions that have promising applications in various fields such as psychotherapy and national security. Thus, micro-expression recognition (MER) has attracted more and more attention from researchers. Although various MER methods have emerged especially with the development of deep learning techniques, the task still faces several challenges, e.g. subtle motion and limited training data. To address these problems, we propose a novel motion extraction strategy (MoExt) for the MER task and use additional macro-expression data in the pre-training process. We primarily pretrain the feature separator and motion extractor using the contrastive loss, thus enabling them to extract representative motion features. In MoExt, shape features and texture features are first extracted separately from onset and apex frames, and then motion features related to MEs are extracted based on the shape features of both frames. To enable the model to more effectively separate features, we utilize the extracted motion features and the texture features from the onset frame to reconstruct the apex frame. Through pre-training, the module is enabled to extract inter-frame motion features of facial expressions while excluding irrelevant information. The feature separator and motion extractor are ultimately integrated into the MER network, which is then fine-tuned using the target ME data. The effectiveness of proposed method is validated on three commonly used datasets, i.e., CASME II, SMIC, SAMM, and CAS(ME)3 dataset. The results show that our method performs favorably against state-of-the-art methods.

Via

Access Paper or Ask Questions

OutlierTune: Efficient Channel-Wise Quantization for Large Language Models

Jun 27, 2024

Jinguang Wang, Yuexi Yin, Haifeng Sun, Qi Qi, Jingyu Wang, Zirui Zhuang, Tingting Yang, Jianxin Liao

Figure 1 for OutlierTune: Efficient Channel-Wise Quantization for Large Language Models

Figure 2 for OutlierTune: Efficient Channel-Wise Quantization for Large Language Models

Figure 3 for OutlierTune: Efficient Channel-Wise Quantization for Large Language Models

Figure 4 for OutlierTune: Efficient Channel-Wise Quantization for Large Language Models

Abstract:Quantizing the activations of large language models (LLMs) has been a significant challenge due to the presence of structured outliers. Most existing methods focus on the per-token or per-tensor quantization of activations, making it difficult to achieve both accuracy and hardware efficiency. To address this problem, we propose OutlierTune, an efficient per-channel post-training quantization (PTQ) method for the activations of LLMs. OutlierTune consists of two components: pre-execution of dequantization and symmetrization. The pre-execution of dequantization updates the model weights by the activation scaling factors, avoiding the internal scaling and costly additional computational overheads brought by the per-channel activation quantization. The symmetrization further reduces the quantization differences arising from the weight updates by ensuring the balanced numerical ranges across different activation channels. OutlierTune is easy to implement and hardware-efficient, introducing almost no additional computational overheads during the inference. Extensive experiments show that the proposed framework outperforms existing methods across multiple different tasks. Demonstrating better generalization, this framework improves the Int6 quantization of the instruction-tuning LLMs, such as OPT-IML, to the same level as half-precision (FP16). Moreover, we have shown that the proposed framework is 1.48x faster than the FP16 implementation while reducing approximately 2x memory usage.

Via

Access Paper or Ask Questions

Multi-channel Deep 3D Face Recognition

Sep 30, 2020

Zhiqian You, Tingting Yang, Miao Jin

Figure 1 for Multi-channel Deep 3D Face Recognition

Figure 2 for Multi-channel Deep 3D Face Recognition

Figure 3 for Multi-channel Deep 3D Face Recognition

Figure 4 for Multi-channel Deep 3D Face Recognition

Abstract:Face recognition has been of great importance in many applications as a biometric for its throughput, convenience, and non-invasiveness. Recent advancements in deep Convolutional Neural Network (CNN) architectures have boosted significantly the performance of face recognition based on two-dimensional (2D) facial texture images and outperformed the previous state of the art using conventional methods. However, the accuracy of 2D face recognition is still challenged by the change of pose, illumination, make-up, and expression. On the other hand, the geometric information contained in three-dimensional (3D) face data has the potential to overcome the fundamental limitations of 2D face data. We propose a multi-Channel deep 3D face network for face recognition based on 3D face data. We compute the geometric information of a 3D face based on its piecewise-linear triangular mesh structure and then conformally flatten geometric information along with the color from 3D to 2D plane to leverage the state-of-the-art deep CNN architectures. We modify the input layer of the network to take images with nine channels instead of three only such that more geometric information can be explicitly fed to it. We pre-train the network using images from the VGG-Face \cite{Parkhi2015} and then fine-tune it with the generated multi-channel face images. The face recognition accuracy of the multi-Channel deep 3D face network has achieved 98.6. The experimental results also clearly show that the network performs much better when a 9-channel image is flattened to plane based on the conformal map compared with the orthographic projection.

Via

Access Paper or Ask Questions

6G White Paper on Edge Intelligence

Apr 30, 2020

Ella Peltonen, Mehdi Bennis, Michele Capobianco, Merouane Debbah, Aaron Ding, Felipe Gil-Castiñeira, Marko Jurmu, Teemu Karvonen, Markus Kelanti, Adrian Kliks(+9 more)

Figure 1 for 6G White Paper on Edge Intelligence

Figure 2 for 6G White Paper on Edge Intelligence

Figure 3 for 6G White Paper on Edge Intelligence

Figure 4 for 6G White Paper on Edge Intelligence

Abstract:In this white paper we provide a vision for 6G Edge Intelligence. Moving towards 5G and beyond the future 6G networks, intelligent solutions utilizing data-driven machine learning and artificial intelligence become crucial for several real-world applications including but not limited to, more efficient manufacturing, novel personal smart device environments and experiences, urban computing and autonomous traffic settings. We present edge computing along with other 6G enablers as a key component to establish the future 2030 intelligent Internet technologies as shown in this series of 6G White Papers. In this white paper, we focus in the domains of edge computing infrastructure and platforms, data and edge network management, software development for edge, and real-time and distributed training of ML/AI algorithms, along with security, privacy, pricing, and end-user aspects. We discuss the key enablers and challenges and identify the key research questions for the development of the Intelligent Edge services. As a main outcome of this white paper, we envision a transition from Internet of Things to Intelligent Internet of Intelligent Things and provide a roadmap for development of 6G Intelligent Edge.

Via

Access Paper or Ask Questions