Abstract:Artificial intelligence generated content (AIGC) technologies, with a predominance of large language models (LLMs), have demonstrated remarkable performance improvements in various applications, which have attracted great interests from both academia and industry. Although some noteworthy advancements have been made in this area, a comprehensive exploration of the intricate relationship between AIGC and communication networks remains relatively limited. To address this issue, this paper conducts an exhaustive survey from dual standpoints: firstly, it scrutinizes the integration of LLMs and AIGC technologies within the domain of communication networks; secondly, it investigates how the communication networks can further bolster the capabilities of LLMs and AIGC. Additionally, this research explores the promising applications along with the challenges encountered during the incorporation of these AI technologies into communication networks. Through these detailed analyses, our work aims to deepen the understanding of how LLMs and AIGC can synergize with and enhance the development of advanced intelligent communication networks, contributing to a more profound comprehension of next-generation intelligent communication networks.
Abstract:Low-complexity Bayes-optimal memory approximate message passing (MAMP) is an efficient signal estimation algorithm in compressed sensing and multicarrier modulation. However, achieving replica Bayes optimality with MAMP necessitates a large-scale right-unitarily invariant transformation, which is prohibitive in practical systems due to its high computational complexity and hardware costs. To solve this difficulty, this letter proposes a low-complexity interleaved block-sparse (IBS) transform, which consists of interleaved multiple low-dimensional transform matrices, aimed at reducing the hardware implementation scale while mitigating performance loss. Furthermore, an IBS cross-domain memory approximate message passing (IBS-CD-MAMP) estimator is developed, comprising a memory linear estimator in the IBS transform domain and a non-linear estimator in the source domain. Numerical results show that the IBS-CD-MAMP offers a reduced implementation scale and lower complexity with excellent performance in IBS-based compressed sensing and interleave frequency division multiplexing systems.
Abstract:In this letter, we study interleave frequency division multiplexing (IFDM) for multicarrier modulation in static multipath and mobile time-varying channels, which outperforms orthogonal frequency division multiplexing (OFDM), orthogonal time frequency space (OTFS), and affine frequency division multiplexing (AFDM) by considering practical advanced detectors. The fundamental principle underlying existing modulation techniques is to establish sparse equivalent channel matrices in order to facilitate the design of low-complexity detection algorithms for signal recovery, making a trade-off between performance and implementation complexity. In contrast, the proposed IFDM establishes an equivalent fully dense and right-unitarily invariant channel matrix with the goal of achieving channel capacity, ensuring that the signals undergo sufficient statistical channel fading. Meanwhile, a low-complexity and replica maximum a posteriori (MAP)-optimal cross-domain memory approximate message passing (CD-MAMP) detector is proposed for IFDM by exploiting the sparsity of the time-domain channel and the unitary invariance in interleave-frequency-domain channel. Numerical results show that IFDM with extremely low-complexity CD-MAMP outperforms OFDM, OTFS, and AFDM with state-of-the-art orthogonal approximate message passing detectors, particularly at low velocities.
Abstract:In this article, we present a novel framework, named distributed task-oriented communication networks (DTCN), based on recent advances in multimodal semantic transmission and edge intelligence. In DTCN, the multimodal knowledge of semantic relays and the adaptive adjustment capability of edge intelligence can be integrated to improve task performance. Specifically, we propose the key techniques in the framework, such as semantic alignment and complement, a semantic relay scheme for deep joint source-channel relay coding, and collaborative device-server optimization and inference. Furthermore, a multimodal classification task is used as an example to demonstrate the benefits of the proposed DTCN over existing methods. Numerical results validate that DTCN can significantly improve the accuracy of classification tasks, even in harsh communication scenarios (e.g., low signal-to-noise regime), thanks to multimodal semantic relay and edge intelligence.
Abstract:To support complex communication scenarios in next-generation wireless communications, this paper focuses on a generalized MIMO (GMIMO) with practical assumptions, such as massive antennas, practical channel coding, arbitrary input distributions, and general right-unitarily-invariant channel matrices (covering Rayleigh fading, certain ill-conditioned and correlated channel matrices). The orthogonal/vector approximate message passing (OAMP/VAMP) receiver has been proved to be information-theoretically optimal in GMIMO, but it is limited to high-complexity LMMSE. To solve this problem, a low-complexity memory approximate message passing (MAMP) receiver has recently been shown to be Bayes optimal but limited to uncoded systems. Therefore, how to design a low-complexity and information-theoretically optimal receiver for GMIMO is still an open issue. To address this issue, this paper proposes an information-theoretically optimal MAMP receiver and investigates its achievable rate analysis and optimal coding principle. Specifically, due to the long-memory linear detection, state evolution (SE) for MAMP is intricately multidimensional and cannot be used directly to analyze its achievable rate. To avoid this difficulty, a simplified single-input single-output variational SE (VSE) for MAMP is developed by leveraging the SE fixed-point consistent property of MAMP and OAMP/VAMP. The achievable rate of MAMP is calculated using the VSE, and the optimal coding principle is established to maximize the achievable rate. On this basis, the information-theoretic optimality of MAMP is proved rigorously. Numerical results show that the finite-length performances of MAMP with practical optimized LDPC codes are 0.5-2.7 dB away from the associated constrained capacities. It is worth noting that MAMP can achieve the same performances as OAMP/VAMP with 0.4% of the time consumption for large-scale systems.
Abstract:Massive Machine-Type Communications (mMTC) features a massive number of low-cost user equipments (UEs) with sparse activity. Tailor-made for these features, grant-free random access (GF-RA) serves as an efficient access solution for mMTC. However, most existing GF-RA schemes rely on strict synchronization, which incurs excessive coordination burden for the low-cost UEs. In this work, we propose a receiver design for asynchronous GF-RA, and address the joint user activity detection (UAD) and channel estimation (CE) problem in the presence of asynchronization-induced inter-symbol interference. Specifically, the delay profile is exploited at the receiver to distinguish different UEs. However, a sample correlation problem in this receiver design impedes the factorization of the joint likelihood function, which complicates the UAD and CE problem. To address this correlation problem, we design a partially uni-directional (PUD) factor graph representation for the joint likelihood function. Building on this PUD factor graph, we further propose a PUD message passing based sparse Bayesian learning (SBL) algorithm for asynchronous UAD and CE (PUDMP-SBL-aUADCE). Our theoretical analysis shows that the PUDMP-SBL-aUADCE algorithm exhibits higher signal-to-interference-and-noise ratio (SINR) in the asynchronous case than in the synchronous case, i.e., the proposed receiver design can exploit asynchronization to suppress multi-user interference. In addition, considering potential timing error from the low-cost UEs, we investigate the impacts of imperfect delay profile, and reveal the advantages of adopting the SBL method in this case. Finally, extensive simulation results are provided to demonstrate the performance of the PUDMP-SBL-aUADCE algorithm.
Abstract:The generalized linear system (GLS) has been widely used in wireless communications to evaluate the effect of nonlinear preprocessing on receiver performance. Generalized approximation message passing (AMP) is a state-of-the-art algorithm for the signal recovery of GLS, but it was limited to measurement matrices with independent and identically distributed (IID) elements. To relax this restriction, generalized orthogonal/vector AMP (GOAMP/GVAMP) for unitarily-invariant measurement matrices was established, which has been proven to be replica Bayes optimal in uncoded GLS. However, the information-theoretic limit of GOAMP/GVAMP is still an open challenge for arbitrary input distributions due to its complex state evolution (SE). To address this issue, in this paper, we provide the achievable rate analysis of GOAMP/GVAMP in GLS, establishing its information-theoretic limit (i.e., maximum achievable rate). Specifically, we transform the fully-unfolded state evolution (SE) of GOAMP/GVAMP into an equivalent single-input single-output variational SE (VSE). Using the VSE and the mutual information and minimum mean-square error (I-MMSE) lemma, the achievable rate of GOAMP/GVAMP is derived. Moreover, the optimal coding principle for maximizing the achievable rate is proposed, based on which a kind of low-density parity-check (LDPC) code is designed. Numerical results verify the achievable rate advantages of GOAMP/GVAMP over the conventional maximum ratio combining (MRC) receiver based on the linearized model and the BER performance gains of the optimized LDPC codes (0.8~2.8 dB) compared to the existing methods.
Abstract:Image-text retrieval (ITR) is a challenging task in the field of multimodal information processing due to the semantic gap between different modalities. In recent years, researchers have made great progress in exploring the accurate alignment between image and text. However, existing works mainly focus on the fine-grained alignment between image regions and sentence fragments, which ignores the guiding significance of context background information. Actually, integrating the local fine-grained information and global context background information can provide more semantic clues for retrieval. In this paper, we propose a novel Hierarchical Graph Alignment Network (HGAN) for image-text retrieval. First, to capture the comprehensive multimodal features, we construct the feature graphs for the image and text modality respectively. Then, a multi-granularity shared space is established with a designed Multi-granularity Feature Aggregation and Rearrangement (MFAR) module, which enhances the semantic corresponding relations between the local and global information, and obtains more accurate feature representations for the image and text modalities. Finally, the ultimate image and text features are further refined through three-level similarity functions to achieve the hierarchical alignment. To justify the proposed model, we perform extensive experiments on MS-COCO and Flickr30K datasets. Experimental results show that the proposed HGAN outperforms the state-of-the-art methods on both datasets, which demonstrates the effectiveness and superiority of our model.
Abstract:Conventional multi-user multiple-input multiple-output (MU-MIMO) mainly focused on Gaussian signaling, independent and identically distributed (IID) channels, and a limited number of users. It will be laborious to cope with the heterogeneous requirements in next-generation wireless communications, such as various transmission data, complicated communication scenarios, and massive user access. Therefore, this paper studies a generalized MU-MIMO (GMU-MIMO) system with more practical constraints, i.e., non-Gaussian signaling, non-IID channel, and massive users and antennas. These generalized assumptions bring new challenges in theory and practice. For example, there is no accurate capacity analysis for GMU-MIMO. In addition, it is unclear how to achieve the capacity optimal performance with practical complexity. To address these challenges, a unified framework is proposed to derive the GMU-MIMO capacity and design a capacity optimal transceiver, which jointly considers encoding, modulation, detection, and decoding. Group asymmetry is developed to make a tradeoff between user rate allocation and implementation complexity. Specifically, the capacity region of group asymmetric GMU-MIMO is characterized by using the celebrated mutual information and minimum mean-square error (MMSE) lemma and the MMSE optimality of orthogonal approximate message passing (OAMP)/vector AMP (VAMP). Furthermore, a theoretically optimal multi-user OAMP/VAMP receiver and practical multi-user low-density parity-check (MU-LDPC) codes are proposed to achieve the capacity region of group asymmetric GMU-MIMO. Numerical results verify that the gaps between theoretical detection thresholds of the proposed framework with optimized MU-LDPC codes and QPSK modulation and the sum capacity of GMU-MIMO are about 0.2 dB. Moreover, their finite-length performances are about 1~2 dB away from the associated sum capacity.