Abstract:Generative foundation models can revolutionize the design of semantic communication (SemCom) systems, allowing high-fidelity exchange of semantic information at ultra-low rates. In this work, a generative SemCom framework with pretrained foundation models is proposed, where both uncoded forward-with-error and coded discard-with-error schemes are developed for the semantic decoder. To characterize the impact of transmission reliability on the perceptual quality of the regenerated signal, their mathematical relationship is analyzed from a rate-distortion-perception perspective and proved to be non-decreasing. The semantic values are defined accordingly to measure the semantic information of multimodal semantic features. We also investigate semantic-aware power allocation problems aiming at power consumption minimization for ultra-low-rate and high-fidelity SemCom. To solve these problems, two semantic-aware power allocation methods are proposed by leveraging the non-decreasing property of the perception-error relationship. Numerically, perception-error functions and semantic values of semantic data streams under both schemes for image tasks are obtained based on the Kodak dataset. Simulation results show that our proposed semantic-aware method significantly outperforms conventional approaches, particularly in the channel-coded case (up to 90% power saving).
Abstract:Generative diffusion models (GDMs) have recently shown great success in synthesizing multimedia signals with high perceptual quality, enabling highly efficient semantic communications in future wireless networks. In this paper, we develop an intent-aware generative semantic multicasting framework utilizing pre-trained diffusion models. In the proposed framework, the transmitter decomposes the source signal into multiple semantic classes based on the multi-user intent, i.e., each user is assumed to be interested in the details of only a subset of the semantic classes. The transmitter then sends to each user only its intended classes, and multicasts a highly compressed semantic map to all users over shared wireless resources, which allows them to locally synthesize the other classes, i.e., the non-intended classes, utilizing pre-trained diffusion models. The signal retrieved at each user is thereby partially reconstructed and partially synthesized utilizing the received semantic map. This improves utilization of the wireless resources while better preserving the privacy of the non-intended classes. We design a communication/computation-aware scheme for per-class adaptation of the communication parameters, such as the transmission power and compression rate, to minimize the total latency of retrieving signals at multiple receivers, tailored to the prevailing channel conditions as well as the users' reconstruction/synthesis distortion/perception requirements. The simulation results demonstrate significantly reduced per-user latency compared with non-generative and intent-unaware multicasting benchmarks while maintaining high perceptual quality of the signals retrieved at the users.
Abstract:Recent advancements in diffusion models have made a significant breakthrough in generative modeling. The combination of generative models and semantic communication (SemCom) enables high-fidelity semantic information exchange at ultra-low rates. A novel generative SemCom framework for image tasks is proposed, wherein pre-trained foundation models serve as semantic encoders and decoders for semantic feature extraction and image regeneration, respectively. The mathematical relationship between the transmission reliability and the perceptual quality of the regenerated image, as well as the semantic values of semantic features, are modeled; both are obtained by conducting numerical simulations on the Kodak dataset. We also investigate the semantic-aware power allocation problem, with the objective of minimizing the total power consumption while guaranteeing semantic performance. To solve this problem, two semantic-aware power allocation methods are proposed, based on constraint decoupling and bisection search, respectively. Numerical results show that the proposed semantic-aware methods achieve lower total power consumption than the conventional one.
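The bisection-search idea named in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical non-decreasing map `quality(power)` from transmit power to a perceptual-quality score; the function, its exponential shape, the power budget, and the target value are illustrative assumptions and are not taken from the paper.

```python
import math

def quality(power: float) -> float:
    """Hypothetical non-decreasing perceptual-quality score in [0, 1]."""
    return 1.0 - math.exp(-power)

def min_power(target: float, lo: float = 0.0, hi: float = 100.0,
              tol: float = 1e-9) -> float:
    """Smallest power in [lo, hi] with quality(power) >= target, by bisection.

    Relies only on quality() being non-decreasing in power.
    """
    if quality(hi) < target:
        raise ValueError("target not reachable within the power budget")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quality(mid) >= target:
            hi = mid   # mid is feasible: the minimum lies at or below mid
        else:
            lo = mid   # mid is infeasible: the minimum lies above mid
    return hi

# Minimum power meeting an (assumed) quality target of 0.9.
p_star = min_power(0.9)
```

Because the search only uses monotonicity, the same loop applies to any perception-error function that is non-decreasing, whatever its closed form.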
Abstract:Over-the-air computation (AirComp) is a promising technology converging communication and computation over wireless networks, which can be particularly effective in model training, inference, and other emerging edge intelligence applications. AirComp relies on uncoded transmission of individual signals, which are added naturally over the multiple access channel thanks to the superposition property of the wireless medium. Despite significantly improved communication efficiency, how to accommodate AirComp in existing and future digital communication networks, which are based on discrete modulation schemes, remains a challenge. This paper proposes a massive digital AirComp (MD-AirComp) scheme that leverages an unsourced massive access protocol to enhance compatibility with both current and next-generation wireless networks. MD-AirComp utilizes vector quantization to reduce the uplink communication overhead, and employs shared quantization and modulation codebooks. At the receiver, we propose a near-optimal approximate message passing-based algorithm to compute the model aggregation results from the superposed sequences, which relies on estimating the number of devices transmitting each code sequence rather than trying to decode the messages of individual transmitters. We apply MD-AirComp to federated edge learning (FEEL), and show that it significantly accelerates FEEL convergence compared to the state of the art while using the same amount of communication resources. To support further research and ensure reproducibility, we have made our code available at https://github.com/liqiao19/MD-AirComp.
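The observation that aggregation needs only the number of devices per code sequence, not the individual messages, can be checked with a small numerical sketch. It assumes a random shared codebook, nearest-neighbour vector quantization, and perfectly known counts; the approximate message passing estimator and the modulation codebook described in the abstract are not modelled, and all sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # assumed shared codebook: 8 codewords, dimension 4
updates = rng.normal(size=(50, 4))   # hypothetical local updates of 50 devices

# Each device quantizes its update to the index of the nearest codeword.
dists = np.linalg.norm(updates[:, None, :] - codebook[None, :, :], axis=2)
indices = dists.argmin(axis=1)

# The receiver only needs how many devices chose each codeword
# (assumed perfectly estimated here).
counts = np.bincount(indices, minlength=len(codebook))

# Aggregate from counts alone versus averaging each device's quantized update.
aggregate = counts @ codebook / counts.sum()
reference = codebook[indices].mean(axis=0)
```

Both quantities coincide, which is why identifying individual transmitters is unnecessary for computing the average.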
Abstract:Generative foundation AI models have recently shown great success in synthesizing natural signals with high perceptual quality using only textual prompts and conditioning signals to guide the generation process. This enables semantic communications at extremely low data rates in future wireless networks. In this paper, we develop a latency-aware semantic communications framework with pre-trained generative models. The transmitter performs multi-modal semantic decomposition on the input signal and transmits each semantic stream with the appropriate coding and communication schemes based on the intent. For the prompt, we adopt a re-transmission-based scheme to ensure reliable transmission, and for the other semantic modalities we use an adaptive modulation/coding scheme to achieve robustness to the changing wireless channel. Furthermore, we design a semantic- and latency-aware scheme to allocate transmission power to different semantic modalities based on their importance, subject to semantic quality constraints. At the receiver, a pre-trained generative model synthesizes a high-fidelity signal using the received multi-stream semantics. Simulation results demonstrate ultra-low-rate, low-latency, and channel-adaptive semantic communications.
Abstract:In this letter, we investigate the signal-to-interference-plus-noise-ratio (SINR) maximization problem in a multi-user massive multiple-input-multiple-output (massive MIMO) system enabled with multiple reconfigurable intelligent surfaces (RISs). We examine two zero-forcing (ZF) beamforming approaches for interference management, namely BS-UE-ZF and BS-RIS-ZF, that force the interference to zero at the users (UEs) and the RISs, respectively. Then, for each case, we solve the SINR maximization problem to find the optimal phase shifts of the elements of the RISs. We also derive asymptotic expressions for the optimal phase shifts and the maximum SINRs when the number of base station (BS) antennas tends to infinity. We show that if the channels of the RIS elements are independent and the number of BS antennas tends to infinity, random phase shifts achieve the maximum SINR using the BS-UE-ZF beamforming approach. The simulation results illustrate that by employing the BS-RIS-ZF beamforming approach, the asymptotic expressions of the phase shifts and maximum SINRs achieve the rate obtained by the optimal phase shifts even for a small number of BS antennas.
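The zero-forcing principle behind the BS-UE-ZF approach, nulling inter-user interference at the users, can be sketched for a plain multi-user downlink. The sketch assumes a hypothetical i.i.d. Rayleigh channel without any RIS and uses the standard pseudo-inverse precoder; it does not reproduce the letter's RIS phase-shift optimization.

```python
import numpy as np

rng = np.random.default_rng(1)
num_antennas, num_users = 16, 4  # hypothetical system size

# Assumed i.i.d. Rayleigh channel: row k is the channel from the BS to user k.
H = (rng.normal(size=(num_users, num_antennas))
     + 1j * rng.normal(size=(num_users, num_antennas))) / np.sqrt(2)

# ZF precoder: right pseudo-inverse of H, with each column scaled to unit norm.
W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
W /= np.linalg.norm(W, axis=0, keepdims=True)

# Entry (k, j) is the signal intended for user j as received by user k.
effective = H @ W
off_diag = effective - np.diag(np.diag(effective))
```

The off-diagonal entries of `effective` vanish up to floating-point error, meaning each user receives only its own stream.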
Abstract:In this paper, the problem of drone-assisted collaborative learning is considered. In this scenario, a swarm of intelligent wireless devices trains a shared neural network (NN) model with the help of a drone. Using its sensors, each device records samples from its environment to gather a local dataset for training. The training data is severely heterogeneous, as the devices have different amounts of data and different sensor noise levels. The intelligent devices iteratively train the NN on their local datasets and exchange the model parameters with the drone for aggregation. For this system, the convergence rate of collaborative learning is derived while considering data heterogeneity, sensor noise levels, and communication errors. Then, the drone trajectory that maximizes the final accuracy of the trained NN is obtained. The proposed trajectory optimization approach is aware of both the devices' data characteristics (i.e., local dataset size and noise level) and their wireless channel conditions, and significantly improves the convergence rate and final accuracy in comparison with baselines that only consider data characteristics or channel conditions. Compared to state-of-the-art baselines, the proposed approach achieves an average 3.85% and 3.54% improvement in the final accuracy of the trained NN on benchmark datasets for image recognition and semantic segmentation tasks, respectively. Moreover, the proposed framework achieves a significant speedup in training, leading to average savings of 24% and 87%, respectively for these tasks, in drone hovering time, communication overhead, and battery usage.
Abstract:In this paper, federated learning (FL) over wireless networks is investigated. In each communication round, a subset of devices is selected to participate in the aggregation with limited time and energy. In order to minimize the convergence time, global loss and latency are jointly considered in a Stackelberg-game-based framework. Specifically, age of information (AoI) based device selection is considered at the leader level as a global loss minimization problem, while sub-channel assignment, computational resource allocation, and power allocation are considered at the follower level as a latency minimization problem. By dividing the follower-level problem into two sub-problems, the best response of the follower is obtained by a monotonic-optimization-based resource allocation algorithm and a matching-based sub-channel assignment algorithm. By deriving an upper bound on the convergence rate, the leader-level problem is reformulated, and then a list-based device selection algorithm is proposed to achieve the Stackelberg equilibrium. Simulation results indicate that the proposed device selection scheme outperforms other schemes in terms of the global loss, and the developed algorithms can significantly decrease the time consumption of computation and communication.
Abstract:We present a new deep-neural-network (DNN) based error correction code for fading channels with output feedback, called the deep SNR-robust feedback (DRF) code. At the encoder, parity symbols are generated by a long short-term memory (LSTM) network based on the message as well as the past forward channel outputs observed by the transmitter in a noisy fashion. The decoder uses a bi-directional LSTM architecture along with a signal-to-noise ratio (SNR)-aware attention NN to decode the message. The proposed code overcomes two major shortcomings of the previously proposed DNN-based codes over channels with passive output feedback: (i) the SNR-aware attention mechanism at the decoder enables reliable application of the same trained NN over a wide range of SNR values; (ii) curriculum training with batch-size scheduling is used to speed up and stabilize training while improving the SNR-robustness of the resulting code. We show that DRF codes significantly outperform the state of the art in terms of both SNR-robustness and error rate in the additive white Gaussian noise (AWGN) channel with feedback. In fading channels with perfect phase compensation at the receiver, DRF codes learn to efficiently exploit knowledge of the instantaneous fading amplitude (which is available to the encoder through feedback) to reduce the overhead and complexity associated with channel estimation at the decoder. Finally, we show the effectiveness of DRF codes in multicast channels with feedback, where linear feedback codes are known to be strictly suboptimal.
Abstract:A new deep-neural-network (DNN) based error correction encoder architecture for channels with feedback, called Deep Extended Feedback (DEF), is presented in this paper. The encoder in the DEF architecture transmits an information message followed by a sequence of parity symbols which are generated based on the message as well as the observations of the past forward channel outputs sent to the transmitter through a feedback channel. DEF codes generalize Deepcode [1] in several ways: parity symbols are generated based on forward-channel output observations over longer time intervals in order to provide better error correction capability; and high-order modulation formats are deployed in the encoder so as to achieve increased spectral efficiency. Performance evaluations show that DEF codes have better performance compared to other DNN-based codes for channels with feedback.