Peizheng Li

A Heuristic-Integrated DRL Approach for Phase Optimization in Large-Scale RISs

May 07, 2025

Wei Wang, Peizheng Li, Angela Doufexi, Mark A. Beach

Abstract:Optimizing discrete phase shifts in large-scale reconfigurable intelligent surfaces (RISs) is challenging due to their non-convex and non-linear nature. In this letter, we propose a heuristic-integrated deep reinforcement learning (DRL) framework that (1) leverages accumulated actions over multiple steps in the double deep Q-network (DDQN) for RIS column-wise control and (2) integrates a greedy algorithm (GA) into each DRL step to refine the state via fine-grained, element-wise optimization of RIS configurations. By learning from GA-included states, the proposed approach effectively addresses RIS optimization within a small DRL action space, demonstrating its capability to optimize phase-shift configurations of large-scale RISs.

* 5 pages, 5 figures. This work has been accepted for publication in IEEE Communications Letters
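
The abstract above mentions refining RIS configurations with a greedy, element-wise pass. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a hypothetical single-user model in which each RIS element contributes one complex cascaded channel gain, and the objective is the received power |sum_n c_n e^{j theta_n}|^2 with theta_n restricted to a few discrete levels. The function name, parameters, and channel model are illustrative assumptions.

```python
import cmath
import math


def greedy_phase_refine(cascaded, phases, levels=2, max_sweeps=10):
    """Greedy element-wise refinement of discrete RIS phase shifts.

    cascaded : list of complex per-element cascaded gains (assumed model)
    phases   : list of integer level indices in [0, levels), one per element
    Returns the refined level indices and the resulting received power.
    """
    phases = list(phases)  # work on a copy
    # Discrete phase shifts available to every element.
    shifts = [cmath.exp(2j * math.pi * k / levels) for k in range(levels)]
    # Combined signal for the starting configuration.
    total = sum(c * shifts[p] for c, p in zip(cascaded, phases))

    for _ in range(max_sweeps):
        changed = False
        for n, c in enumerate(cascaded):
            # Contribution of all other elements, with element n removed.
            rest = total - c * shifts[phases[n]]
            # Pick the level that maximizes the combined magnitude.
            best = max(range(levels), key=lambda k: abs(rest + c * shifts[k]))
            candidate = rest + c * shifts[best]
            # Accept only strict improvements so the power never decreases.
            if abs(candidate) > abs(total) + 1e-12:
                total = candidate
                phases[n] = best
                changed = True
        if not changed:
            break  # a full sweep found no improving single-element change

    return phases, abs(total) ** 2
```

Each sweep visits every element once and keeps a change only when it strictly increases the received power, so the procedure stops at a configuration that no single-element change can improve. In the framework described in the abstract, a pass of this kind would run inside each DRL step; how the two are coupled is specified in the paper, not here.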

AGO: Adaptive Grounding for Open World 3D Occupancy Prediction

Apr 14, 2025

Peizheng Li, Shuxiao Ding, You Zhou, Qingwen Zhang, Onat Inak, Larissa Triess, Niklas Hanselmann, Marius Cordts, Andreas Zell

Abstract:Open-world 3D semantic occupancy prediction aims to generate a voxelized 3D representation from sensor inputs while recognizing both known and unknown objects. Transferring open-vocabulary knowledge from vision-language models (VLMs) offers a promising direction but remains challenging. Methods based on VLM-derived 2D pseudo-labels with traditional supervision are limited by a predefined label space and lack general prediction capabilities. Direct alignment with pretrained image embeddings, on the other hand, fails to achieve reliable performance because image and text representations in VLMs are often inconsistent. To address these challenges, we propose AGO, a novel 3D occupancy prediction framework with adaptive grounding to handle diverse open-world scenarios. AGO first encodes surrounding images and class prompts into 3D and text embeddings, respectively, leveraging similarity-based grounding training with 3D pseudo-labels. Additionally, a modality adapter maps 3D embeddings into a space aligned with VLM-derived image embeddings, reducing modality gaps. Experiments on Occ3D-nuScenes show that AGO improves unknown object prediction in zero-shot and few-shot transfer while achieving state-of-the-art closed-world self-supervised performance, surpassing prior methods by 4.09 mIoU.

TQD-Track: Temporal Query Denoising for 3D Multi-Object Tracking

Apr 04, 2025

Shuxiao Ding, Yutong Yang, Julian Wiederer, Markus Braun, Peizheng Li, Juergen Gall, Bin Yang

Abstract:Query denoising has become a standard training strategy for DETR-based detectors by addressing the slow convergence issue. Query denoising can also increase the diversity of training samples for modeling complex scenarios, which is critical for Multi-Object Tracking (MOT), showing its potential in MOT applications. Existing approaches integrate query denoising within the tracking-by-attention paradigm. However, as the denoising process happens only within a single frame, it does not help the tracker learn temporal information. In addition, the attention mask in query denoising prevents information exchange between denoising and object queries, limiting its potential for improving association using self-attention. To address these issues, we propose TQD-Track, which introduces Temporal Query Denoising (TQD) tailored for MOT, enabling denoising queries to carry temporal information and instance-specific feature representations. We introduce diverse noise types onto denoising queries that simulate real-world challenges in MOT. We analyze our proposed TQD for different tracking paradigms and find that paradigms with an explicitly learned data association module, e.g., tracking-by-detection or alternating detection and association, benefit from TQD by a larger margin. For these paradigms, we further design an association mask in the association module to ensure that track and detection queries interact during training as they do during inference. Extensive experiments on the nuScenes dataset demonstrate that our approach consistently enhances different tracking methods by only changing the training process, especially paradigms with an explicit association module.

Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations

Mar 31, 2025

Adrián Sánchez-Mompó, Ioannis Mavromatis, Peizheng Li, Konstantinos Katsaros, Aftab Khan

Abstract:This study presents an empirical investigation into the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines. For Discriminative models, we examine various architectures and hyperparameters during training and inference and identify energy-efficient practices. For Generative AI, Large Language Models (LLMs) are assessed, focusing primarily on energy consumption across different model sizes and varying service requests. Our study employs software-based power measurements, ensuring ease of replication across diverse configurations, models, and datasets. We analyse multiple models and hardware setups to uncover correlations among various metrics, identifying key contributors to energy consumption. The results indicate that for Discriminative models, optimising architectures, hyperparameters, and hardware can significantly reduce energy consumption without sacrificing performance. For LLMs, energy efficiency depends on balancing model size, reasoning complexity, and request-handling capacity, as larger models do not necessarily consume more energy when utilisation remains low. This analysis provides practical guidelines for designing green and sustainable ML operations, emphasising energy consumption and carbon footprint reductions while maintaining performance. This paper can serve as a benchmark for accurately estimating total energy use across different types of AI models.

* Published to MDPI Information - Artificial Intelligence Section

Task-Oriented Connectivity for Networked Robotics with Generative AI and Semantic Communications

Mar 09, 2025

Peizheng Li, Adnan Aijaz

Abstract:The convergence of robotics, advanced communication networks, and artificial intelligence (AI) holds the promise of transforming industries through fully automated and intelligent operations. In this work, we introduce a novel co-working framework for robots that unifies goal-oriented semantic communication (SemCom) with a Generative AI (GenAI)-agent under a semantic-aware network. SemCom prioritizes the exchange of meaningful information among robots and the network, thereby reducing overhead and latency. Meanwhile, the GenAI-agent leverages generative AI models to interpret high-level task instructions, allocate resources, and adapt to dynamic changes in both network and robotic environments. This agent-driven paradigm ushers in a new level of autonomy and intelligence, enabling complex tasks of networked robots to be conducted with minimal human intervention. We validate our approach through a multi-robot anomaly detection use-case simulation, where robots detect, compress, and transmit relevant information for classification. Simulation results confirm that SemCom significantly reduces data traffic while preserving critical semantic details, and the GenAI-agent ensures task coordination and network adaptation. This synergy provides a robust, efficient, and scalable solution for modern industrial environments.

* 6 pages, 7 figures. This paper has been submitted to IEEE for possible publication

Building the Self-Improvement Loop: Error Detection and Correction in Goal-Oriented Semantic Communications

Nov 03, 2024

Peizheng Li, Xinyi Lin, Adnan Aijaz

Abstract:Error detection and correction are essential for ensuring robust and reliable operation in modern communication systems, particularly in complex transmission environments. However, discussions on these topics have largely been overlooked in semantic communication (SemCom), which focuses on transmitting meaning rather than symbols, leading to significant improvements in communication efficiency. Despite these advantages, semantic errors -- stemming from discrepancies between transmitted and received meanings -- present a major challenge to system reliability. This paper addresses this gap by proposing a comprehensive framework for detecting and correcting semantic errors in SemCom systems. We formally define semantic error, detection, and correction mechanisms, and identify key sources of semantic errors. To address these challenges, we develop a Gaussian process (GP)-based method for latent space monitoring to detect errors, alongside a human-in-the-loop reinforcement learning (HITL-RL) approach to optimize semantic model configurations using user feedback. Experimental results validate the effectiveness of the proposed methods in mitigating semantic errors under various conditions, including adversarial attacks, input feature changes, physical channel variations, and user preference shifts. This work lays the foundation for more reliable and adaptive SemCom systems with robust semantic error management techniques.

* 7 pages, 8 figures, this paper has been accepted for publication in IEEE CSCN 2024

From Hype to Reality: The Road Ahead of Deploying DRL in 6G Networks

Oct 30, 2024

Haiyuan Li, Hari Madhukumar, Peizheng Li, Yiran Teng, Shuangyi Yan, Dimitra Simeonidou

Abstract:The industrial landscape is rapidly evolving with the advent of 6G applications, which demand massive connectivity, high computational capacity, and ultra-low latency. These requirements present new challenges, which can no longer be efficiently addressed by conventional strategies. In response, this article underscores the transformative potential of Deep Reinforcement Learning (DRL) for 6G, highlighting its advantages over classic machine learning solutions in meeting the demands of 6G. The necessity of DRL is further validated through three DRL applications in an end-to-end communication procedure, including wireless access control, baseband function placement, and network slicing coordination. However, DRL-based network management initiatives are far from mature. We extend the discussion to identify the challenges of applying DRL in practical networks and explore potential solutions along with their respective limitations. In the end, these insights are validated through a practical DRL deployment in managing network slices on the testbed.

Adapting MLOps for Diverse In-Network Intelligence in 6G Era: Challenges and Solutions

Oct 24, 2024

Peizheng Li, Ioannis Mavromatis, Tim Farnham, Adnan Aijaz, Aftab Khan

Abstract:Seamless integration of artificial intelligence (AI) and machine learning (ML) techniques with wireless systems is a crucial step for 6G AInization. However, such integration faces challenges in terms of model functionality and lifecycle management. ML operations (MLOps) offer a systematic approach to tackle these challenges. Existing approaches toward implementing MLOps in a centralized platform often overlook the challenges posed by diverse learning paradigms and network heterogeneity. This article provides a new approach to MLOps targeting the intricacies of future wireless networks. Considering unique aspects of the future radio access network (RAN), we formulate three operational pipelines, namely reinforcement learning operations (RLOps), federated learning operations (FedOps), and generative AI operations (GenOps). These pipelines form the foundation for seamlessly integrating various learning/inference capabilities into networks. We outline the specific challenges and proposed solutions for each operation, facilitating large-scale deployment of AI-Native 6G networks.

* 7 pages, 5 figures. This paper has been submitted to IEEE for possible publication

Modeling and Analysis of Multi-Line Orders in Multi-Tote Storage and Retrieval Autonomous Mobile Robot Systems

Jul 08, 2024

Xiaotao Shan, Yichao Jin, Peizheng Li, Koichi Kondo

Abstract:As warehouses are emphasizing space utilization and the ability to handle multi-line orders, multi-tote storage and retrieval (MTSR) autonomous mobile robot systems, where robots directly retrieve totes from high shelves, are becoming increasingly popular. This paper presents a novel shared-token, multi-class, semi-open queueing network model to account for multi-line orders with general distribution forms in MTSR systems. The numerical results obtained from solving the SOQN model are validated against discrete-event simulation, with most key performance metrics demonstrating high accuracy. In our experimental setting, results indicate a 12.5% reduction in the minimum number of robots needed to satisfy a specific order arrival rate using the closest retrieval sequence policy compared with the random policy. Increasing the number of tote buffer positions on a robot can greatly reduce the number of robots required in the warehouse.

* 8 pages, 5 figures. This paper has been accepted for publication in IEEE 20th International Conference on Automation Science and Engineering (IEEE CASE 2024)

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Jul 01, 2024

Qingwen Zhang, Yi Yang, Peizheng Li, Olov Andersson, Patric Jensfelt

Abstract:Scene flow estimation predicts the 3D motion at each point in successive LiDAR scans. This detailed, point-level information can help autonomous vehicles accurately predict and understand dynamic changes in their surroundings. Current state-of-the-art methods require annotated data to train scene flow networks, and the expense of labeling inherently limits their scalability. Self-supervised approaches can overcome these limitations, yet face two principal challenges that hinder optimal performance: point distribution imbalance and disregard for object-level motion constraints. In this paper, we propose SeFlow, a self-supervised method that integrates efficient dynamic classification into a learning-based scene flow pipeline. We demonstrate that classifying static and dynamic points helps design targeted objective functions for different motion patterns. We also emphasize the importance of internal cluster consistency and correct object point association to refine the scene flow estimation, in particular on object details. Our real-time capable method achieves state-of-the-art performance on the self-supervised scene flow task on Argoverse 2 and Waymo datasets. The code is open-sourced at https://github.com/KTH-RPL/SeFlow along with trained model weights.

* 25 pages (14 main pages + 11 supp material), 5 figures
