Abstract: This paper proposes a novel digital deep joint source-channel coding (DeepJSCC) framework that achieves robust performance across diverse communication environments without requiring extensive retraining or prior knowledge of the communication environment. Traditional digital DeepJSCC techniques often struggle to adapt to varying communication environments, as they require significant training overhead and large amounts of communication data to develop either multiple specialized models or a single generalized model in pre-defined communication environments. To address this challenge, our framework devises an error-adaptive blind training strategy that eliminates the need for prior knowledge of communication environments. This is achieved by modeling the relationship between the encoder's output and the decoder's input as binary symmetric channels and by optimizing the bit-flip probabilities as trainable parameters. Our framework also presents a training-aware communication strategy that dynamically selects the optimal encoder-decoder pair and transmission parameters based on the current channel conditions. In particular, this strategy includes an adaptive power and modulation control method that minimizes the total transmission power while maintaining high task performance. Simulation results demonstrate that our framework outperforms existing DeepJSCC methods, achieving a higher peak signal-to-noise ratio and lower power consumption while requiring significantly fewer encoder-decoder pairs for adaptation.
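As a concrete illustration of the error-adaptive blind training idea, the sketch below models the encoder-to-decoder link as a binary symmetric channel whose bit-flip probability is a trainable parameter. The sigmoid parameterization and the straight-through gradient estimator are our own assumptions about how such a channel layer could be made differentiable, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class TrainableBSC(nn.Module):
    """Binary symmetric channel with a trainable bit-flip probability.

    Models the encoder-output -> decoder-input relation as a BSC whose
    flip probability is learned jointly with the codec (a hypothetical
    realization of the paper's trainable bit-flip probabilities).
    """
    def __init__(self, init_logit: float = -2.0):
        super().__init__()
        self.flip_logit = nn.Parameter(torch.tensor(init_logit))

    def forward(self, bits: torch.Tensor) -> torch.Tensor:
        p = torch.sigmoid(self.flip_logit)                   # flip prob in (0, 1)
        flips = torch.bernoulli(p.detach().expand_as(bits))  # sampled bit flips
        noisy = bits * (1 - flips) + (1 - bits) * flips      # XOR with the flips
        # Straight-through estimator: the forward pass returns the sampled
        # bits, while gradients reach p through the expected output.
        expected = bits * (1 - p) + (1 - bits) * p
        return expected + (noisy - expected).detach()

channel = TrainableBSC()
hard_bits = torch.randint(0, 2, (4, 128)).float()  # encoder output bits
decoder_in = channel(hard_bits)                    # differentiable w.r.t. flip_logit
```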
Abstract: This paper studies a hybrid language model (HLM) architecture that integrates a small language model (SLM) operating on a mobile device with a large language model (LLM) hosted at the base station (BS) of a wireless network. The HLM token generation process follows the speculative inference principle: the SLM's vocabulary distribution is uploaded to the LLM, which either accepts or rejects each draft token, with rejected tokens resampled by the LLM. While this approach ensures alignment between the vocabulary distributions of the SLM and LLM, it suffers from low token throughput due to uplink transmission and the computation costs of running both language models. To address this, we propose a novel HLM structure coined Uncertainty-aware HLM (U-HLM), wherein the SLM locally measures its output uncertainty and skips both uplink transmissions and LLM operations for tokens that are likely to be accepted. This opportunistic skipping is enabled by our empirical finding of a linear correlation between the SLM's uncertainty and the LLM's rejection probability. We analytically derive the uncertainty threshold and evaluate its expected risk of rejection. Simulations show that U-HLM reduces uplink transmissions and LLM computation by 45.93%, while achieving up to 97.54% of the LLM's inference accuracy and 2.54$\times$ faster token throughput than HLM without skipping.
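A minimal sketch of the opportunistic skipping rule, assuming the SLM's uncertainty is measured as the entropy of its vocabulary distribution (the paper derives its threshold analytically; the measure and the threshold value here are illustrative assumptions):

```python
import torch

def generate_token_u_hlm(slm_logits: torch.Tensor, threshold: float):
    """One U-HLM-style token step: sample from the SLM, then decide
    whether the uplink transmission and LLM verification can be skipped."""
    probs = torch.softmax(slm_logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum()  # uncertainty proxy
    token = torch.multinomial(probs, num_samples=1)
    if entropy.item() < threshold:
        return token, "accept_locally"  # likely accepted: skip uplink + LLM
    return token, "send_to_llm"         # uncertain: speculative check at the BS

logits = torch.randn(32000)             # hypothetical SLM vocabulary size
token, decision = generate_token_u_hlm(logits, threshold=2.0)
```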
Abstract: Low Probability of Detection (LPD) communication aims to obscure the presence of radio frequency (RF) signals in order to evade surveillance. In the context of mobile surveillance conducted by unmanned aerial vehicles (UAVs), achieving LPD communication is particularly challenging because the UAVs move rapidly and continuously, with unknown nonlinear dynamics. Accurately predicting the future locations of UAVs is therefore essential for enabling real-time LPD communication. In this paper, we introduce a novel framework termed predictive covert communication, aimed at minimizing detectability in terrestrial ad-hoc networks under multi-UAV surveillance. Our data-driven method synergistically integrates graph neural networks (GNN) with Koopman theory to model the complex interactions within a multi-UAV network and to facilitate long-term predictions by linearizing the dynamics, even with limited historical data. Extensive simulation results substantiate that the trajectories predicted by our method yield a 63%-75% lower probability of detection compared to well-known state-of-the-art baseline approaches, showing promise for enabling low-latency covert operations in practical scenarios.
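The core prediction step can be sketched as follows: a learned encoder lifts UAV states into a latent space where the dynamics are approximately linear, so long-horizon forecasting reduces to repeated multiplication by a single Koopman matrix. For brevity, the GNN over the multi-UAV interaction graph is abstracted as an MLP encoder; all shapes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class KoopmanPredictor(nn.Module):
    """Sketch of Koopman-based multi-step UAV trajectory prediction.

    The encoder lifts a UAV state (e.g., position and velocity) into a
    latent space with linear dynamics; the paper's GNN interaction model
    is abstracted here as an MLP (an assumption for brevity).
    """
    def __init__(self, state_dim=4, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Linear(latent_dim, state_dim)
        self.K = nn.Parameter(torch.eye(latent_dim))  # Koopman matrix

    def forward(self, x, horizon: int):
        z = self.encoder(x)
        preds = []
        for _ in range(horizon):
            z = z @ self.K.T              # linear latent dynamics
            preds.append(self.decoder(z))
        return torch.stack(preds, dim=1)  # (batch, horizon, state_dim)

model = KoopmanPredictor()
future = model(torch.randn(8, 4), horizon=20)  # 20-step-ahead prediction
```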
Abstract: Six-dimensional movable antenna (6DMA) is an innovative technology that improves wireless network capacity by adjusting the 3D positions and 3D rotations of antenna surfaces based on the channel spatial distribution. However, existing works on 6DMA have assumed a central processing unit (CPU) that jointly processes the signals of all 6DMA surfaces to execute various tasks, which inevitably incurs a prohibitively high processing cost for channel estimation. Therefore, we propose a distributed 6DMA processing architecture that reduces the processing complexity of the CPU by equipping each 6DMA surface with a local processing unit (LPU). In particular, we unveil for the first time a new \textbf{\textit{directional sparsity}} property of 6DMA channels, whereby each user has significant channel gains only for a (small) subset of 6DMA position-rotation pairs, namely those that can receive direct/reflected signals from the user. In addition, we propose a practical three-stage protocol for the 6DMA-equipped base station (BS) to conduct statistical CSI acquisition for all 6DMA candidate positions/rotations, 6DMA position/rotation optimization, and instantaneous channel estimation for user data transmission with the optimized 6DMA positions/rotations. Specifically, the directional sparsity is leveraged to develop distributed algorithms for joint sparsity detection and channel power estimation, as well as for directional sparsity-aided instantaneous channel estimation. Using the estimated channel power, we develop a channel power-based optimization algorithm that maximizes the ergodic sum rate of the users by optimizing the antenna positions/rotations. Simulation results show that our channel estimation algorithms are more accurate than benchmarks while requiring lower pilot overhead, and that our optimization outperforms fluid/movable antennas optimized only in two dimensions (2D), even when the latter have perfect instantaneous CSI.
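To make the directional sparsity property concrete, the toy detector below marks a candidate position-rotation pair as active when its estimated channel power exceeds the noise floor by a margin; the threshold rule and the simulated data are illustrative assumptions, not the paper's distributed algorithm.

```python
import numpy as np

def detect_directional_sparsity(pilot_rx, noise_power, margin_db=3.0):
    """Toy directional-sparsity detector for a single user.

    pilot_rx: (num_pairs, num_pilots) received pilot samples, one row per
    candidate 6DMA position-rotation pair. A pair is declared active when
    its estimated channel power rises above the noise floor by margin_db.
    """
    power_est = np.maximum(np.mean(np.abs(pilot_rx) ** 2, axis=1) - noise_power, 0.0)
    threshold = noise_power * (10 ** (margin_db / 10.0) - 1.0)
    return power_est > threshold, power_est

rng = np.random.default_rng(0)
num_pairs, num_pilots = 64, 32
gains = np.zeros(num_pairs)
gains[rng.choice(num_pairs, size=5, replace=False)] = 1.0   # sparse support
pilot_rx = gains[:, None] + rng.normal(scale=0.3, size=(num_pairs, num_pilots))
active, power = detect_directional_sparsity(pilot_rx, noise_power=0.09)
```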
Abstract: The Koopman autoencoder, a data-driven technique, has gained traction in recent years for modeling nonlinear dynamics with deep learning methods. Given the linear characteristics inherent to the Koopman operator, controlling its eigenvalues offers an opportunity to enhance long-term prediction performance, a critical task for forecasting future trends in time-series datasets with long-term behaviors. However, controlling eigenvalues is challenging due to high computational complexity and the difficulty of managing them during the training process. To tackle this issue, we propose leveraging the singular value decomposition (SVD) of the Koopman matrix and adjusting its singular values for better long-term prediction. Experimental results demonstrate that, during training, the loss term on the singular values effectively brings the eigenvalues close to the unit circle, and that the proposed approach outperforms existing baseline methods on long-term prediction tasks.
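A minimal sketch of the proposed regularization, assuming an L2 penalty that pulls the Koopman matrix's singular values toward 1 (the exact form of the paper's loss term may differ):

```python
import torch

def singular_value_loss(koopman_matrix: torch.Tensor) -> torch.Tensor:
    """Penalty pulling the Koopman matrix's singular values toward 1.

    Directly constraining eigenvalues is costly, so the singular values
    are regularized instead; svdvals is differentiable, so this term can
    simply be added to the autoencoder's training loss.
    """
    s = torch.linalg.svdvals(koopman_matrix)
    return ((s - 1.0) ** 2).mean()

K = torch.randn(16, 16, requires_grad=True)  # learned Koopman matrix
loss = singular_value_loss(K)                # add to the reconstruction loss
loss.backward()
```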
Abstract: In computer vision, the vision transformer (ViT) has increasingly superseded the convolutional neural network (CNN) thanks to its improved accuracy and robustness. However, ViT's large model size and high sample complexity make it difficult to train on resource-constrained edge devices. Split learning (SL) emerges as a viable solution, leveraging server-side resources to train ViTs while utilizing private data from distributed devices. However, SL requires additional information exchange for weight updates between the device and the server, which can be exploited by various attacks on private training data. To mitigate the risk of data breaches in classification tasks, inspired by CutMix regularization, we propose a novel privacy-preserving SL framework, coined DP-CutMixSL, that injects Gaussian noise into smashed data and mixes randomly chosen patches of smashed data across clients. Our analysis demonstrates that DP-CutMixSL is a differentially private (DP) mechanism that strengthens privacy protection against membership inference attacks during forward propagation. Through simulations, we show that DP-CutMixSL improves privacy protection against membership inference, reconstruction, and label inference attacks, while also improving accuracy compared to DP-SL and DP-MixSL.
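A toy sketch of the smashed-data processing in DP-CutMixSL: Gaussian noise is added to each client's smashed activations, and a randomly located patch from one client is pasted into another's before the uplink. Patch sizing, noise scale, and tensor shapes are illustrative assumptions.

```python
import torch

def dp_cutmix_smashed(smashed_a, smashed_b, sigma=0.1, mix_ratio=0.25):
    """Noise-and-mix two clients' smashed data, CutMix-style.

    smashed_a, smashed_b: activations of shape (B, C, H, W) cut at the
    split layer. Gaussian noise supports the DP guarantee; the patch swap
    mixes clients' data before it reaches the server.
    """
    a = smashed_a + sigma * torch.randn_like(smashed_a)
    b = smashed_b + sigma * torch.randn_like(smashed_b)
    _, _, H, W = a.shape
    ph, pw = int(H * mix_ratio ** 0.5), int(W * mix_ratio ** 0.5)
    y = torch.randint(0, H - ph + 1, (1,)).item()   # random patch location
    x = torch.randint(0, W - pw + 1, (1,)).item()
    mixed = a.clone()
    mixed[:, :, y:y + ph, x:x + pw] = b[:, :, y:y + ph, x:x + pw]
    return mixed

client_a = torch.randn(4, 64, 16, 16)
client_b = torch.randn(4, 64, 16, 16)
uplink = dp_cutmix_smashed(client_a, client_b)  # sent to the server
```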
Abstract: In the new paradigm of semantic communication (SC), the focus is on delivering the meaning behind bits by extracting semantic information from raw data. Recent advances in data-to-text models facilitate language-oriented SC, particularly for text-transformed image communication via image-to-text (I2T) encoding and text-to-image (T2I) decoding. However, although semantically aligned, text is too coarse to precisely capture sophisticated visual features such as spatial locations, color, and texture, incurring a significant perceptual difference between the intended and reconstructed images. To address this limitation, in this paper we propose a novel language-oriented SC framework that communicates both text and a compressed image embedding and combines them using a latent diffusion model to reconstruct the intended image. Experimental results validate the potential of our approach, which transmits only 2.09\% of the original image size while achieving higher perceptual similarity over noisy communication channels compared to a baseline SC method that communicates only through text. The code is available at https://github.com/ispamm/Img2Img-SC/ .
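The transmit/receive pipeline can be sketched as below, with hypothetical stand-ins for the real models (an I2T captioner, a latent compressor, and a text-and-latent-conditioned diffusion decoder); only the noisy-channel step is functional here, and all names and shapes are assumptions rather than the released implementation.

```python
import numpy as np

# Hypothetical stand-ins for the real models; all names and shapes below
# are illustrative assumptions, not the paper's implementation.
def i2t_encode(image):
    """Image-to-text captioner placeholder (e.g., a BLIP-style model)."""
    return "a red car parked next to a tree"

def compress_embedding(image, dim=64):
    """Latent compressor placeholder: a small vector summarizing the image."""
    return image.reshape(-1)[:dim].copy()

def noisy_channel(vec, snr_db=10.0):
    """AWGN channel acting on the compressed embedding payload."""
    power = np.mean(vec ** 2)
    sigma = np.sqrt(power / 10 ** (snr_db / 10))
    return vec + np.random.normal(scale=sigma, size=vec.shape)

def diffusion_decode(caption, latent):
    """Latent-diffusion decoder placeholder, conditioned on both inputs."""
    return {"prompt": caption, "conditioning": latent}

image = np.random.rand(3, 64, 64)
caption = i2t_encode(image)                        # text branch
latent = noisy_channel(compress_embedding(image))  # embedding branch
reconstruction = diffusion_decode(caption, latent) # combined at the receiver
```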
Abstract: In recent years, novel communication strategies have emerged to address the challenges posed by the growing number of connected devices and the higher quality of transmitted information. Among them, semantic communication has obtained promising results, especially when combined with state-of-the-art deep generative models, such as large language models or diffusion models, which can regenerate content from extremely compressed semantic information. However, most of these approaches focus on single-user scenarios and process the received content at the receiver on top of conventional communication systems. In this paper, we go beyond these methods by developing a novel generative semantic communication framework tailored for multi-user scenarios. The system assigns the channel to users with the knowledge that lost information can be filled in by a diffusion model at the receivers. Under this innovative perspective, OFDMA systems should not aim to transmit the largest possible amount of information, but solely the bits necessary for the generative model to semantically regenerate the missing ones. A thorough experimental evaluation shows the capabilities of the novel diffusion model and the effectiveness of the proposed framework, leading towards a GenAI-based next generation of communications.
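The resource allocation principle can be sketched as a greedy OFDMA allocator that grants each user resource blocks only until its semantically essential bit budget is met, leaving the rest to be regenerated by the diffusion model at the receiver; the greedy best-gain rule and the simple rate model are illustrative assumptions.

```python
import numpy as np

def semantic_ofdma_allocation(channel_gains, essential_bits, bits_per_rb):
    """Greedy OFDMA allocator under the generative-SC principle.

    Each user is granted resource blocks (RBs) only until its semantically
    essential bit budget is met; the receiver's generative model fills in
    the rest. channel_gains: (num_users, num_rbs); essential_bits: (num_users,).
    """
    num_users, num_rbs = channel_gains.shape
    delivered = np.zeros(num_users)
    allocation = -np.ones(num_rbs, dtype=int)  # -1 marks an unused RB
    for rb in np.argsort(-channel_gains.max(axis=0)):
        needy = np.flatnonzero(delivered < essential_bits)
        if needy.size == 0:
            break  # every user's essential payload is already covered
        user = needy[np.argmax(channel_gains[needy, rb])]
        allocation[rb] = user
        delivered[user] += bits_per_rb * channel_gains[user, rb]
    return allocation, delivered

gains = np.random.rand(3, 12)  # 3 users, 12 RBs
alloc, got = semantic_ofdma_allocation(
    gains, essential_bits=np.array([2.0, 3.0, 1.0]), bits_per_rb=2.0)
```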
Abstract: In this article, we propose a multi-agent deep reinforcement learning (MADRL) framework to train a multiple access protocol for downlink low earth orbit (LEO) satellite networks. By improving upon an existing learned protocol, emergent random access channel (eRACH), our proposed method, coined centralized and compressed emergent signaling for eRACH (Ce2RACH), mitigates inter-satellite interference by exchanging additional signaling messages jointly learned through the MADRL training process. Simulations demonstrate that Ce2RACH achieves up to 36.65% higher network throughput compared to eRACH, while the cost of signaling messages increases linearly with the number of users.
Abstract: In this work, we compare emergent communication (EC) built upon multi-agent deep reinforcement learning (MADRL) with language-oriented semantic communication (LSC) empowered by a pre-trained large language model (LLM) using human language. In a multi-agent remote navigation task with multimodal input data comprising location and channel maps, it is shown that EC incurs high training cost and struggles with multimodal data, whereas LSC incurs high inference computing cost due to the LLM's large size. To address their respective bottlenecks, we propose a novel framework of language-guided EC (LEC), which guides EC training using LSC via knowledge distillation (KD). Simulations corroborate that LEC achieves faster travel time while avoiding areas with poor channel conditions, and speeds up MADRL training convergence by up to 61.8% compared to EC.
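A minimal sketch of the language-guided training objective, assuming knowledge distillation is realized as a temperature-scaled KL term that pulls the EC agent's message/action logits toward the LSC teacher's distribution and is added to the usual MADRL loss (the weighting and temperature are assumptions):

```python
import torch
import torch.nn.functional as F

def lec_distillation_loss(ec_logits, lsc_teacher_logits,
                          rl_loss, kd_weight=0.5, temperature=2.0):
    """Language-guided EC (LEC) objective sketch: MADRL loss plus a
    KL-based distillation term from the LSC teacher to the EC student."""
    kd = F.kl_div(
        F.log_softmax(ec_logits / temperature, dim=-1),
        F.softmax(lsc_teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # standard KD temperature scaling
    return rl_loss + kd_weight * kd

loss = lec_distillation_loss(torch.randn(8, 10), torch.randn(8, 10),
                             rl_loss=torch.tensor(1.0))
```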