Jia Guo

Pinching-Antenna Systems (PASS): A Tutorial

Aug 11, 2025

Yuanwei Liu, Hao Jiang, Xiaoxia Xu, Zhaolin Wang, Jia Guo, Chongjun Ouyang, Xidong Mu, Zhiguo Ding, Arumugam Nallanathan, George K. Karagiannidis(+1 more)

Abstract:Pinching antenna systems (PASS) present a breakthrough among the flexible-antenna technologies, and distinguish themselves by facilitating large-scale antenna reconfiguration, line-of-sight creation, scalable implementation, and near-field benefits, thus bringing wireless communications from the last mile to the last meter. A comprehensive tutorial is presented in this paper. First, the fundamentals of PASS are discussed, including PASS signal models, hardware models, power radiation models, and pinching antenna activation methods. Building upon this, the information-theoretic capacity limits achieved by PASS are characterized, and several typical performance metrics of PASS-based communications are analyzed to demonstrate its superiority over conventional antenna technologies. Next, the pinching beamforming design is investigated. The corresponding power scaling law is first characterized. For the joint transmit and pinching design in the general multiple-waveguide case, 1) a pair of transmission strategies is proposed for PASS-based single-user communications to validate the superiority of PASS, namely sub-connected and fully connected structures; and 2) three practical protocols are proposed for facilitating PASS-based multi-user communications, namely waveguide switching, waveguide division, and waveguide multiplexing. A possible implementation of PASS in wideband communications is further highlighted. Moreover, the channel state information acquisition in PASS is elaborated with a pair of promising solutions. To overcome the high complexity and suboptimality inherent in conventional convex-optimization-based approaches, machine-learning-based methods for operating PASS are also explored, focusing on selected deep neural network architectures and training algorithms. Finally, several promising applications of PASS in next-generation wireless networks are highlighted.

* Submitted to IEEE journal

When Attention is Beneficial for Learning Wireless Resource Allocation Efficiently?

Jul 03, 2025

Jia Guo, Chenyang Yang

Abstract:Owing to the use of the attention mechanism to leverage the dependency across tokens, Transformers are efficient for natural language processing. By harnessing the permutation properties that broadly exist in resource allocation policies, each of which maps measurable environmental parameters (e.g., channel matrix) to optimized variables (e.g., precoding matrix), graph neural networks (GNNs) are promising for learning these policies efficiently in terms of scalability and generalizability. To reap the benefits of both architectures, there is a recent trend of incorporating the attention mechanism into GNNs for learning wireless policies. Nevertheless, is the attention mechanism really needed for resource allocation? In this paper, we strive to answer this question by analyzing the structures of functions defined on sets and numerical algorithms, given that the permutation properties of wireless policies are induced by the involved sets (e.g., the user set). In particular, we prove that the permutation equivariant functions on a single set can be recursively expressed by two types of functions: one involves attention, and the other does not. We proceed to re-express the numerical algorithms for optimizing several representative resource allocation problems in recursive forms. We find that when interference (e.g., multi-user or inter-data-stream interference) is not reflected in the measurable parameters of a policy, attention needs to be used to model the interference. With this insight, we establish a framework for designing GNNs that align with these structures. By taking reconfigurable intelligent surface-aided hybrid precoding as an example, the learning efficiency of the proposed GNN is validated via simulations.

* 13 pages, 6 figures
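
The distinction the abstract draws between equivariant functions with and without attention can be illustrated with a toy sketch. This is not the paper's construction: the two layers below (`mean_pool_layer`, `attention_layer`), their coefficients, and the scalar features are hypothetical choices made only to show what permutation equivariance on a single set means and how it can be checked numerically.

```python
import math

def mean_pool_layer(xs, a=0.7, b=0.3):
    """Equivariant layer without attention: own feature plus an unweighted set mean.
    The coefficients a and b are arbitrary illustrative values."""
    m = sum(xs) / len(xs)
    return [a * x + b * m for x in xs]

def attention_layer(xs):
    """Equivariant layer with attention: each element aggregates the whole set
    using softmax weights computed from pairwise products."""
    out = []
    for xi in xs:
        scores = [xi * xj for xj in xs]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(sum(w * xj for w, xj in zip(weights, xs)) / total)
    return out

def is_equivariant(layer, xs, perm, tol=1e-9):
    """Check that permuting the input permutes the output in the same way."""
    lhs = layer([xs[p] for p in perm])
    full = layer(xs)
    rhs = [full[p] for p in perm]
    return all(abs(l - r) <= tol for l, r in zip(lhs, rhs))

xs = [0.5, -1.2, 2.0, 0.1]   # toy per-user features
perm = [2, 0, 3, 1]          # a reordering of the user set
print(is_equivariant(mean_pool_layer, xs, perm))
print(is_equivariant(attention_layer, xs, perm))
```

Both layers pass the check; they differ in that the attention layer weights the other elements by pairwise interaction, whereas the pooling layer treats them uniformly.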

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs

Jun 18, 2025

Ling Team, Bin Hu, Cai Chen, Deng Zhao, Ding Liu, Dingnan Jin, Feng Zhu, Hao Dai, Hongzhi Luan, Jia Guo(+36 more)

Abstract:We present Ring-lite, a Mixture-of-Experts (MoE)-based large language model optimized via reinforcement learning (RL) to achieve efficient and robust reasoning capabilities. Built upon the publicly available Ling-lite model, a 16.8 billion parameter model with 2.75 billion activated parameters, our approach matches the performance of state-of-the-art (SOTA) small-scale reasoning models on challenging benchmarks (e.g., AIME, LiveCodeBench, GPQA-Diamond) while activating only one-third of the parameters required by comparable models. To accomplish this, we introduce a joint training pipeline integrating distillation with RL, revealing undocumented challenges in MoE RL training. First, we identify optimization instability during RL training, and we propose Constrained Contextual Computation Policy Optimization (C3PO), a novel approach that enhances training stability and improves computational throughput via an algorithm-system co-design methodology. Second, we empirically demonstrate that selecting distillation checkpoints based on entropy loss for RL training, rather than validation metrics, yields superior performance-efficiency trade-offs in subsequent RL training. Finally, we develop a two-stage training paradigm to harmonize multi-domain data integration, addressing domain conflicts that arise in training with mixed datasets. We will release the model, dataset, and code.

* Technical Report

Search is All You Need for Few-shot Anomaly Detection

Apr 16, 2025

Qishan Wang, Jia Guo, Shuyong Gao, Haofen Wang, Li Xiong, Junjie Hu, Hanqi Guo, Wenqiang Zhang

Abstract:Few-shot anomaly detection (FSAD) has emerged as a crucial yet challenging task in industrial inspection, where normal distribution modeling must be accomplished with only a few normal images. While existing approaches typically employ multi-modal foundation models combining language and vision modalities for prompt-guided anomaly detection, these methods often demand sophisticated prompt engineering and extensive manual tuning. In this paper, we demonstrate that a straightforward nearest-neighbor search framework can surpass state-of-the-art performance in both single-class and multi-class FSAD scenarios. Our proposed method, VisionAD, consists of four simple yet essential components: (1) scalable vision foundation models that extract universal and discriminative features; (2) dual augmentation strategies - support augmentation to enhance feature matching adaptability and query augmentation to address the oversights of single-view prediction; (3) multi-layer feature integration that captures both low-frequency global context and high-frequency local details with minimal computational overhead; and (4) a class-aware visual memory bank enabling efficient one-for-all multi-class detection. Extensive evaluations across MVTec-AD, VisA, and Real-IAD benchmarks demonstrate VisionAD's exceptional performance. Using only one normal image as support, our method achieves remarkable image-level AUROC scores of 97.4%, 94.8%, and 70.8% respectively, outperforming current state-of-the-art approaches by significant margins (+1.6%, +3.2%, and +1.4%). The training-free nature and superior few-shot capabilities of VisionAD make it particularly appealing for real-world applications where samples are scarce or expensive to obtain. Code is available at https://github.com/Qiqigeww/VisionAD.
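
The general idea of nearest-neighbor anomaly scoring against a class-aware memory bank can be sketched as follows. This is a minimal illustration, not the VisionAD implementation (see the repository linked above for that): the function names, the two-dimensional toy features, and the choice of Euclidean distance with max-over-patches aggregation are all assumptions made for the example. In practice the features would come from a vision foundation model.

```python
import math

def nearest_distance(feature, bank):
    """Distance from one query feature to its closest stored normal feature."""
    return min(math.dist(feature, stored) for stored in bank)

def score_image(patch_features, class_name, memory_banks):
    """Score every patch against the bank of its class; the image-level score
    is taken as the largest patch score (an assumed aggregation rule)."""
    bank = memory_banks[class_name]
    patch_scores = [nearest_distance(f, bank) for f in patch_features]
    return max(patch_scores), patch_scores

# Hypothetical memory bank: features extracted from a few normal support images,
# stored per class so that one bank lookup serves multi-class detection.
memory_banks = {"bottle": [(0.0, 0.0), (1.0, 0.0)]}

# Hypothetical query patches: one close to normal data, one far away.
query = [(0.0, 0.1), (3.0, 4.0)]
image_score, patch_scores = score_image(query, "bottle", memory_banks)
print(image_score, patch_scores)
```

No training step is involved: adding a class only requires adding its support features to the bank, which is what makes this family of methods training-free.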

Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models

Apr 09, 2025

Ling Team, Caizhi Tang, Chilin Fu, Chunwei Wu, Jia Guo, Jianwen Wang, Jingyu Hu, Liang Jiang, Meng Li, Peng Jiao(+8 more)

Abstract:This technical report presents Ring-Lite-Distill, a lightweight reasoning model derived from our open-source Mixture-of-Experts (MoE) Large Language Model (LLM), Ling-Lite. This study demonstrates that through meticulous high-quality data curation and ingenious training paradigms, the compact MoE model Ling-Lite can be further trained to achieve exceptional reasoning capabilities, while maintaining its parameter-efficient architecture with only 2.75 billion activated parameters, establishing an efficient lightweight reasoning architecture. In particular, in constructing this model, we have not merely focused on enhancing advanced reasoning capabilities, exemplified by high-difficulty mathematical problem solving, but rather aimed to develop a reasoning model with more comprehensive competency coverage. Our approach ensures coverage across reasoning tasks of varying difficulty levels while preserving generic capabilities, such as instruction following, tool use, and knowledge retention. We show that Ring-Lite-Distill's reasoning ability reaches a level comparable to DeepSeek-R1-Distill-Qwen-7B, while its general capabilities significantly surpass those of DeepSeek-R1-Distill-Qwen-7B. The models are accessible at https://huggingface.co/inclusionAI

* 10 pages

Learning Precoding in Multi-user Multi-antenna Systems: Transformer or Graph Transformer?

Mar 04, 2025

Yuxuan Duan, Jia Guo, Chenyang Yang

Abstract:Transformers have been designed for channel acquisition tasks such as channel prediction and other tasks such as precoding, while graph neural networks (GNNs) have been demonstrated to be efficient for learning a multitude of communication tasks. Nonetheless, whether or not Transformers are efficient for tasks other than channel acquisition and how to reap the benefits of both architectures are less understood. In this paper, we take learning precoding policies in multi-user multi-antenna systems as an example to answer the questions. We observe that a Transformer tailored for precoding can reflect multiuser interference, which is essential for its generalizability to the number of users. Yet the tailored Transformer can leverage only part of the permutation property of precoding policies and hence is not generalizable to the number of antennas, the same as a GNN learning over a homogeneous graph. To provide useful insight, we establish the relation between Transformers and the GNNs that learn over heterogeneous graphs. Based on the relation, we propose Graph Transformers, namely 2D- and 3D-Gformers, for exploiting the permutation properties of baseband precoding and hybrid precoding policies. The learning performance, inference and training complexity, and size-generalizability of the Gformers are evaluated and compared with Transformers and GNNs via simulations.

* 13 pages, 9 figures

Diffusion Model for Multiple Antenna Communications

Feb 03, 2025

Jia Guo, Xiaoxia Xu, Yuanwei Liu, Arumugam Nallanathan

Abstract:The potential of applying diffusion models (DMs) for multiple antenna communications is discussed. A unified framework of applying DMs for multiple antenna tasks is first proposed. Then, the tasks are divided into two categories, i.e., decision-making tasks and generation tasks, depending on whether an optimization of system parameters is involved. For each category, the discussion covers 1) how the framework can be used for each task and 2) why the DM is superior to traditional artificial intelligence (TAI) and conventional optimization tasks. It is highlighted that DMs are well-suited for scenarios with strong interference and noise, excelling in modeling complex data distributions and exploring better actions. A case study of learning beamforming with a DM is then provided to demonstrate the superiority of DMs with simulation results. Finally, the applications of DMs for emerging multiple antenna technologies and promising research directions are discussed.

* 7 pages, 4 figures

GPASS: Deep Learning for Beamforming in Pinching-Antenna Systems (PASS)

Feb 03, 2025

Jia Guo, Yuanwei Liu, Arumugam Nallanathan

Abstract:A novel GPASS architecture is proposed for jointly learning pinching beamforming and transmit beamforming in pinching antenna systems (PASS). GPASS has a staged architecture, where the positions of pinching antennas are first learned by a sub-GNN. Then, the transmit beamforming is learned by another sub-GNN based on the antenna positions. The sub-GNNs incorporate the permutation property of the beamforming policy, which helps improve the learning performance. The optimal solution structure of transmit beamforming is also leveraged to simplify the mappings to be learned. Numerical results demonstrate that the proposed architecture can achieve a higher spectral efficiency (SE) than a heuristic baseline method with low inference complexity.

* 5 pages, 3 figures

SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages

Dec 02, 2024

Jia Guo, Longxu Dou, Guangtao Zeng, Stanley Kok, Wei Lu, Qian Liu

Abstract:In this paper, we introduce SailCompass, a reproducible and robust evaluation benchmark for assessing Large Language Models (LLMs) on Southeast Asian Languages (SEA). SailCompass encompasses three main SEA languages, eight primary tasks including 14 datasets covering three task types (generation, multiple-choice questions, and classification). To improve the robustness of the evaluation approach, we explore different prompt configurations for multiple-choice questions and leverage calibrations to improve the faithfulness of classification tasks. With SailCompass, we derive the following findings: (1) SEA-specialized LLMs still outperform general LLMs, although the gap has narrowed; (2) A balanced language distribution is important for developing better SEA-specialized LLMs; (3) Advanced prompting techniques (e.g., calibration, perplexity-based ranking) are necessary to better utilize LLMs. All datasets and evaluation scripts are public.

* code: https://github.com/sail-sg/sailcompass

Enhancing Brain Age Estimation with a Multimodal 3D CNN Approach Combining Structural MRI and AI-Synthesized Cerebral Blood Volume Data

Dec 01, 2024

Jordan Jomsky, Zongyu Li, Yiren Zhang, Jia Guo

Abstract:The growing global aging population necessitates enhanced methods for assessing brain aging and related neurodegenerative changes. Brain Age Gap Estimation (BrainAGE) offers a neuroimaging biomarker for understanding these changes by predicting brain age from MRI scans. Current approaches primarily use T1-weighted magnetic resonance imaging (T1w MRI) data, capturing only structural brain information. To address the lack of functional data, we integrated AI-generated Cerebral Blood Volume (AICBV) with T1w MRI, combining both structural and functional metrics. We developed a deep learning model using a VGG-based architecture to predict brain age. Our model achieved a mean absolute error (MAE) of 3.95 years and a correlation of \(R^2 = 0.94\) on the test set (\(n = 288\)), outperforming existing models trained on similar data. We have further created gradient-based class activation maps (Grad-CAM) to visualize the regions of the brain that most influenced the model's predictions, providing interpretable insights into the structural and functional contributors to brain aging.

* 10 pages, 5 figures
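
For readers unfamiliar with the two metrics quoted in the abstract, the sketch below shows how a mean absolute error and an \(R^2\) value can be computed from predicted and chronological ages. It is only an illustration on made-up numbers: the ages are hypothetical, and \(R^2\) is computed here as the coefficient of determination, which is an assumption, since the abstract does not state which definition the authors used.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical chronological ages and model predictions, in years.
chronological = [60.0, 70.0, 80.0]
predicted = [62.0, 69.0, 83.0]

# Brain age gap per subject: predicted age minus chronological age.
gaps = [p - c for p, c in zip(predicted, chronological)]

print(mean_absolute_error(chronological, predicted))
print(r_squared(chronological, predicted))
print(gaps)
```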
