Abstract:Semantic communication, augmented by knowledge bases (KBs), offers substantial reductions in transmission overhead and resilience to errors. However, existing methods predominantly rely on end-to-end training to construct KBs, often failing to fully capitalize on the rich information available at communication devices. Motivated by the growing convergence of sensing and communication, we introduce a novel Position-Aided Semantic Communication (PASC) framework, which integrates localization into semantic transmission. This framework is particularly designed for position-based image communication, such as real-time uploading of outdoor camera-view images. By utilizing the position, the framework retrieves corresponding maps, and then an advanced foundation model (FM)-driven view generator is employed to synthesize images closely resembling the target images. The PASC framework further leverages the FM to fuse the synthesized image with deviations from the real one, enhancing semantic reconstruction. Notably, the framework is highly flexible, capable of adapting to dynamic content and fluctuating channel conditions through a novel FM-based parameter optimization strategy. Additionally, the challenges of real-time deployment are addressed, with the development of a hardware testbed to validate the framework. Simulations and real-world tests demonstrate that the proposed PASC approach not only significantly boosts transmission efficiency, but also remains robust in diverse and evolving transmission scenarios.
Abstract:Along with the prosperity of generative artificial intelligence (AI), its potential for solving conventional challenges in wireless communications has also surfaced. Inspired by this trend, we investigate the application of the advanced diffusion models (DMs), a representative class of generative AI models, to high dimensional wireless channel estimation. By capturing the structure of multiple-input multiple-output (MIMO) wireless channels via a deep generative prior encoded by DMs, we develop a novel posterior inference method for channel reconstruction. We further adapt the proposed method to recover channel information from low-resolution quantized measurements. Additionally, to enhance the over-the-air viability, we integrate the DM with the unsupervised Stein's unbiased risk estimator to enable learning from noisy observations and circumvent the requirements for ground truth channel data that is hardly available in practice. Results reveal that the proposed estimator achieves high-fidelity channel recovery while reducing estimation latency by a factor of 10 compared to state-of-the-art schemes, facilitating real-time implementation. Moreover, our method outperforms existing estimators while reducing the pilot overhead by half, showcasing its scalability to ultra-massive antenna arrays.
Abstract:Semantic communication has undergone considerable evolution due to the recent rapid development of artificial intelligence (AI), significantly enhancing both communication robustness and efficiency. Despite these advancements, most current semantic communication methods for image transmission pay little attention to the differing importance of objects and backgrounds in images. To address this issue, we propose a novel scheme named ASCViT-JSCC, which utilizes vision transformers (ViTs) integrated with an orthogonal frequency division multiplexing (OFDM) system. This scheme adaptively allocates bandwidth for objects and backgrounds in images according to the importance order of different parts determined by object detection of you only look once version 5 (YOLOv5) and feature points detection of scale invariant feature transform (SIFT). Furthermore, the proposed scheme adheres to digital modulation standards by incorporating quantization modules. We validate this approach through an over-the-air (OTA) testbed named intelligent communication prototype validation platform (ICP) based on a software-defined radio (SDR) and NVIDIA embedded kits. Our findings from both simulations and practical measurements show that ASCViT-JSCC significantly preserves objects in images and enhances reconstruction quality compared to existing methods.
Abstract:Satellite communications can provide massive connections and seamless coverage, but they also face several challenges, such as rain attenuation, long propagation delays, and co-channel interference. To improve transmission efficiency and address severe scenarios, semantic communication has become a popular choice, particularly when equipped with foundation models (FMs). In this study, we introduce an FM-based semantic satellite communication framework, termed FMSAT. This framework leverages FM-based segmentation and reconstruction to significantly reduce bandwidth requirements and accurately recover semantic features under high noise and interference. Considering the high speed of satellites, an adaptive encoder-decoder is proposed to protect important features and avoid frequent retransmissions. Meanwhile, a well-received image can provide a reference for repairing damaged images under sudden attenuation. Since acknowledgment feedback is subject to long propagation delays when retransmission is unavoidable, a novel error detection method is proposed to roughly detect semantic errors at the regenerative satellite. With the proposed detectors at both the satellite and the gateway, the quality of the received images can be ensured. The simulation results demonstrate that the proposed method can significantly reduce bandwidth requirements, adapt to complex satellite scenarios, and protect semantic information with an acceptable transmission delay.
Abstract:Foundation models (FMs), including large language models, have become increasingly popular due to their wide-ranging applicability and ability to understand human-like semantics. While previous research has explored the use of FMs in semantic communications to improve semantic extraction and reconstruction, the impact of these models on different system levels, considering computation and memory complexity, requires further analysis. This study focuses on integrating FMs at the effectiveness, semantic, and physical levels, using universal knowledge to profoundly transform system design. Additionally, it examines the use of compact models to balance performance and complexity, comparing three separate approaches that employ FMs. Ultimately, the study highlights unresolved issues in the field that need addressing.
Abstract:Semantic communication significantly reduces required bandwidth by understanding semantic meaning of the transmitted. However, current deep learning-based semantic communication methods rely on joint source-channel coding design and end-to-end training, which limits their adaptability to new physical channels and user requirements. Reconfigurable intelligent surfaces (RIS) offer a solution by customizing channels in different environments. In this study, we propose the RIS-SC framework, which allocates semantic contents with varying levels of RIS assistance to satisfy the changing user requirements. It takes into account user movement and line-of-sight obstructions, enabling the RIS resource to protect important semantics in challenging channel conditions. The simulation results indicate reasonable task performance, but some semantic parts that have no effect on task performances are abandoned under severe channel conditions. To address this issue, a reconstruction method is also introduced to improve visual acceptance by inferring those missing semantic parts. Furthermore, the framework can adjust RIS resources in friendly channel conditions to save and allocate them efficiently among multiple users. Simulation results demonstrate the adaptability and efficiency of the RIS-SC framework across diverse channel conditions and user requirements.
Abstract:Semantic communication has become a popular research area due its high spectrum efficiency and error-correction performance. Some studies use deep learning to extract semantic features, which usually form end-to-end semantic communication systems and are hard to address the varying wireless environments. Therefore, the novel semantic-based coding methods and performance metrics have been investigated and the designed semantic systems consist of various modules as in the conventional communications but with improved functions. This article discusses recent achievements in the state-of-art semantic communications exploiting the conventional modules in wireless systems. We demonstrate through two examples that the traditional hybrid automatic repeat request and modulation methods can be redesigned for novel semantic coding and metrics to further improve the performance of wireless semantic communications. At the end of this article, some open issues are identified.
Abstract:Video conferencing has become a popular mode of meeting even if it consumes considerable communication resources. Conventional video compression causes resolution reduction under limited bandwidth. Semantic video conferencing maintains high resolution by transmitting some keypoints to represent motions because the background is almost static, and the speakers do not change often. However, the study on the impact of the transmission errors on keypoints is limited. In this paper, we initially establish a basal semantic video conferencing (SVC) network, which dramatically reduces transmission resources while only losing detailed expressions. The transmission errors in SVC only lead to a changed expression, whereas those in the conventional methods destroy pixels directly. However, the conventional error detector, such as the cyclic redundancy check, cannot reflect the degree of expression changes. To overcome this issue, we develop an incremental redundancy hybrid automatic repeat-request (IR-HARQ) framework for the varying channels (SVC-HARQ) incorporating a novel semantic error detector. The SVC-HARQ has flexibility in bit consumption and achieves good performance. In addition, SVC-CSI is designed for channel state information (CSI) feedback to allocate the keypoint transmission and enhance the performance dramatically. Simulation shows that the proposed wireless semantic communication system can significantly improve the transmission efficiency.This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Abstract:Recently, semantic communication has been brought to the forefront because of its great success in deep learning (DL), especially Transformer. Even if semantic communication has been successfully applied in the sentence transmission to reduce semantic errors, existing architecture is usually fixed in the codeword length and is inefficient and inflexible for the varying sentence length. In this paper, we exploit hybrid automatic repeat request (HARQ) to reduce semantic transmission error further. We first combine semantic coding (SC) with Reed Solomon (RS) channel coding and HARQ, called SC-RS-HARQ, which exploits the superiority of the SC and the reliability of the conventional methods successfully. Although the SC-RS-HARQ is easily applied in the existing HARQ systems, we also develop an end-to-end architecture, called SCHARQ, to pursue the performance further. Numerical results demonstrate that SCHARQ significantly reduces the required number of bits for sentence semantic transmission and sentence error rate. Finally, we attempt to replace error detection from cyclic redundancy check to a similarity detection network called Sim32 to allow the receiver to reserve the wrong sentences with similar semantic information and to save transmission resources.
Abstract:Orthogonal frequency division multiplexing (OFDM) is one of the key technologies that are widely applied in current communication systems. Recently, artificial intelligence (AI)-aided OFDM receivers have been brought to the forefront to break the bottleneck of the traditional OFDM systems. In this paper, we investigate two AI-aided OFDM receivers, data-driven fully connected-deep neural network (FC-DNN) receiver and model-driven ComNet receiver, respectively. We first study their performance under different channel models through simulation and then establish a real-time video transmission system using a 5G rapid prototyping (RaPro) system for over-the-air (OTA) test. To address the performance gap between the simulation and the OTA test caused by the discrepancy between the channel model for offline training and real environments, we develop a novel online training strategy, called SwitchNet receiver. The SwitchNet receiver is with a flexible and extendable architecture and can adapts to real channel by training one parameter online. The OTA test verifies its feasibility and robustness to real environments and indicates its potential for future communications systems. At the end of this paper, we discuss some challenges to inspire future research.