Abstract:The rapid development of the quantum technology presents huge opportunities for 6G communications. Leveraging the quantum properties of highly excited Rydberg atoms, Rydberg atom-based antennas present distinct advantages, such as high sensitivity, broad frequency range, and compact size, over traditional antennas. To realize efficient precoding, accurate channel state information is essential. However, due to the distinct characteristics of atomic receivers, traditional channel estimation algorithms developed for conventional receivers are no longer applicable. To this end, we propose a novel channel estimation algorithm based on projection gradient descent (PGD), which is applicable to both one-dimensional (1D) and twodimensional (2D) arrays. Simulation results are provided to show the effectiveness of our proposed channel estimation method.
Abstract:The recently emerged movable antenna (MA) shows great promise in leveraging spatial degrees of freedom to enhance the performance of wireless systems. However, resource allocation in MA-aided systems faces challenges due to the nonconvex and coupled constraints on antenna positions. This paper systematically reveals the challenges posed by the minimum antenna separation distance constraints. Furthermore, we propose a penalty optimization framework for resource allocation under such new constraints for MA-aided systems. Specifically, the proposed framework separates the non-convex and coupled antenna distance constraints from the movable region constraints by introducing auxiliary variables. Subsequently, the resulting problem is efficiently solved by alternating optimization, where the optimization of the original variables resembles that in conventional resource allocation problem while the optimization with respect to the auxiliary variables is achieved in closedform solutions. To illustrate the effectiveness of the proposed framework, we present three case studies: capacity maximization, latency minimization, and regularized zero-forcing precoding. Simulation results demonstrate that the proposed optimization framework consistently outperforms state-of-the-art schemes.
Abstract:In the realm of activity detection for massive machine-type communications, intelligent reflecting surfaces (IRS) have shown significant potential in enhancing coverage for devices lacking direct connections to the base station (BS). However, traditional activity detection methods are typically designed for a single type of channel model, which does not reflect the complexities of real-world scenarios, particularly in systems incorporating IRS. To address this challenge, this paper introduces a novel approach that combines model-driven deep unfolding with a mixture of experts (MoE) framework. By automatically selecting one of three expert designs and applying it to the unfolded projected gradient method, our approach eliminates the need for prior knowledge of channel types between devices and the BS. Simulation results demonstrate that the proposed MoE-augmented deep unfolding method surpasses the traditional covariance-based method and black-box neural network design, delivering superior detection performance under mixed channel fading conditions.
Abstract:Learning-based street scene semantic understanding in autonomous driving (AD) has advanced significantly recently, but the performance of the AD model is heavily dependent on the quantity and quality of the annotated training data. However, traditional manual labeling involves high cost to annotate the vast amount of required data for training robust model. To mitigate this cost of manual labeling, we propose a Label Anything Model (denoted as LAM), serving as an interpretable, high-fidelity, and prompt-free data annotator. Specifically, we firstly incorporate a pretrained Vision Transformer (ViT) to extract the latent features. On top of ViT, we propose a semantic class adapter (SCA) and an optimization-oriented unrolling algorithm (OptOU), both with a quite small number of trainable parameters. SCA is proposed to fuse ViT-extracted features to consolidate the basis of the subsequent automatic annotation. OptOU consists of multiple cascading layers and each layer contains an optimization formulation to align its output with the ground truth as closely as possible, though which OptOU acts as being interpretable rather than learning-based blackbox nature. In addition, training SCA and OptOU requires only a single pre-annotated RGB seed image, owing to their small volume of learnable parameters. Extensive experiments clearly demonstrate that the proposed LAM can generate high-fidelity annotations (almost 100% in mIoU) for multiple real-world datasets (i.e., Camvid, Cityscapes, and Apolloscapes) and CARLA simulation dataset.
Abstract:To improve the generalization of the autonomous driving (AD) perception model, vehicles need to update the model over time based on the continuously collected data. As time progresses, the amount of data fitted by the AD model expands, which helps to improve the AD model generalization substantially. However, such ever-expanding data is a double-edged sword for the AD model. Specifically, as the fitted data volume grows to exceed the the AD model's fitting capacities, the AD model is prone to under-fitting. To address this issue, we propose to use a pretrained Large Vision Models (LVMs) as backbone coupled with downstream perception head to understand AD semantic information. This design can not only surmount the aforementioned under-fitting problem due to LVMs' powerful fitting capabilities, but also enhance the perception generalization thanks to LVMs' vast and diverse training data. On the other hand, to mitigate vehicles' computational burden of training the perception head while running LVM backbone, we introduce a Posterior Optimization Trajectory (POT)-Guided optimization scheme (POTGui) to accelerate the convergence. Concretely, we propose a POT Generator (POTGen) to generate posterior (future) optimization direction in advance to guide the current optimization iteration, through which the model can generally converge within 10 epochs. Extensive experiments demonstrate that the proposed method improves the performance by over 66.48\% and converges faster over 6 times, compared to the existing state-of-the-art approach.
Abstract:This work considers a spatial non-stationary channel tracking problem in broadband extremely large-scale multiple-input-multiple-output (XL-MIMO) systems. In the case of spatial non-stationary, each scatterer has a certain visibility region (VR) over antennas and power change may occur among visible antennas. Concentrating on the temporal correlation of XL-MIMO channels, we design a three-layer Markov prior model and hierarchical two-dimensional (2D) Markov model to exploit the dynamic sparsity of sparse channel vectors and VRs, respectively. Then, we formulate the channel tracking problem as a bilinear measurement process, and a novel dynamic alternating maximum a posteriori (DA-MAP) framework is developed to solve the problem. The DA-MAP contains four basic modules: channel estimation module, VR detection module, grid update module, and temporal correlated module. Specifically, the first module is an inverse-free variational Bayesian inference (IF-VBI) estimator that avoids computational intensive matrix inverse each iteration; the second module is a turbo compressive sensing (Turbo-CS) algorithm that only needs small-scale matrix operations in a parallel fashion; the third module refines the polar-delay domain grid; and the fourth module can process the temporal prior information to ensure high-efficiency channel tracking. Simulations show that the proposed method can achieve a significant channel tracking performance while achieving low computational overhead.
Abstract:The key technologies of sixth generation (6G), such as ultra-massive multiple-input multiple-output (MIMO), enable intricate interactions between antennas and wireless propagation environments. As a result, it becomes necessary to develop joint models that encompass both antennas and wireless propagation channels. To achieve this, we utilize the multi-port communication theory, which considers impedance matching among the source, transmission medium, and load to facilitate efficient power transfer. Specifically, we first investigate the impact of insertion loss, mutual coupling, and other factors on the performance of multi-port matching networks. Next, to further improve system performance, we explore two important deep unfolding designs for the multi-port matching networks: beamforming and power control, respectively. For the hybrid beamforming, we develop a deep unfolding framework, i.e., projected gradient descent (PGD)-Net based on unfolding projected gradient descent. For the power control, we design a deep unfolding network, graph neural network (GNN) aided alternating optimization (AO)Net, which considers the interaction between different ports in optimizing power allocation. Numerical results verify the necessity of considering insertion loss in the dynamic metasurface antenna (DMA) performance analysis. Besides, the proposed PGD-Net based hybrid beamforming approaches approximate the conventional model-based algorithm with very low complexity. Moreover, our proposed power control scheme has a fast run time compared to the traditional weighted minimum mean squared error (WMMSE) method.
Abstract:Street Scene Semantic Understanding (denoted as TriSU) is a complex task for autonomous driving (AD). However, inference model trained from data in a particular geographical region faces poor generalization when applied in other regions due to inter-city data domain-shift. Hierarchical Federated Learning (HFL) offers a potential solution for improving TriSU model generalization by collaborative privacy-preserving training over distributed datasets from different cities. Unfortunately, it suffers from slow convergence because data from different cities are with disparate statistical properties. Going beyond existing HFL methods, we propose a Gaussian heterogeneous HFL algorithm (FedGau) to address inter-city data heterogeneity so that convergence can be accelerated. In the proposed FedGau algorithm, both single RGB image and RGB dataset are modelled as Gaussian distributions for aggregation weight design. This approach not only differentiates each RGB image by respective statistical distribution, but also exploits the statistics of dataset from each city in addition to the conventionally considered data volume. With the proposed approach, the convergence is accelerated by 35.5\%-40.6\% compared to existing state-of-the-art (SOTA) HFL methods. On the other hand, to reduce the involved communication resource, we further introduce a novel performance-aware adaptive resource scheduling (AdapRS) policy. Unlike the traditional static resource scheduling policy that exchanges a fixed number of models between two adjacent aggregations, AdapRS adjusts the number of model aggregation at different levels of HFL so that unnecessary communications are minimized. Extensive experiments demonstrate that AdapRS saves 29.65\% communication overhead compared to conventional static resource scheduling policy while maintaining almost the same performance.
Abstract:To efficiently express tensor data using the Tucker format, a critical task is to minimize the multilinear rank such that the model would not be over-flexible and lead to overfitting. Due to the lack of rank minimization tools in tensor, existing works connect Tucker multilinear rank minimization to trace norm minimization of matrices unfolded from the tensor data. While these formulations try to exploit the common aim of identifying the low-dimensional structure of the tensor and matrix, this paper reveals that existing trace norm-based formulations in Tucker completion are inefficient in multilinear rank minimization. We further propose a new interpretation of Tucker format such that trace norm minimization is applied to the factor matrices of the equivalent representation, rather than some matrices unfolded from tensor data. Based on the newly established problem formulation, a fixed point iteration algorithm is proposed, and its convergence is proved. Numerical results are presented to show that the proposed algorithm exhibits significant improved performance in terms of multilinear rank learning and consequently tensor signal recovery accuracy, compared to existing trace norm based Tucker completion methods.
Abstract:The movable antenna (MA) is a promising technology to exploit more spatial degrees of freedom for enhancing wireless system performance. However, the MA-aided system introduces the non-convex antenna distance constraints, which poses challenges in the underlying optimization problems. To fill this gap, this paper proposes a general framework for optimizing the MA-aided system under the antenna distance constraints. Specifically, we separate the non-convex antenna distance constraints from the objective function by introducing auxiliary variables. Then, the resulting problem can be efficiently solved under the alternating optimization framework. For the subproblems with respect to the antenna position variables and auxiliary variables, the proposed algorithms are able to obtain at least stationary points without any approximations. To verify the effectiveness of the proposed optimization framework, we present two case studies: capacity maximization and regularized zero-forcing precoding. Simulation results demonstrate the proposed optimization framework outperforms the existing baseline schemes under both cases.