Abstract:Adapting foundation models for specific purposes has become a standard approach to build machine learning systems for downstream applications. Yet, it is an open question which mechanisms take place during adaptation. Here we develop a new Sparse Autoencoder (SAE) for the CLIP vision transformer, named PatchSAE, to extract interpretable concepts at granular levels (e.g. shape, color, or semantics of an object) and their patch-wise spatial attributions. We explore how these concepts influence the model output in downstream image classification tasks and investigate how recent state-of-the-art prompt-based adaptation techniques change the association of model inputs to these concepts. While activations of concepts slightly change between adapted and non-adapted models, we find that the majority of gains on common adaptation tasks can be explained with the existing concepts already present in the non-adapted foundation model. This work provides a concrete framework to train and use SAEs for Vision Transformers and provides insights into explaining adaptation mechanisms.
Abstract:This study introduces an innovative approach for adaptive power allocation in Non-Orthogonal Multiple Access (NOMA) systems, enhanced by the integration of spaceborne and terrestrial signals through a Reconfigurable Intelligent Surface (RIS). We develop an adaptive mechanism to adjust the power distribution between spaceborne and terrestrial signals according to variations in environmental conditions and elevation angles. This mechanism employs a sophisticated transition model that combines Gaussian Mixture Models with Log-Normal distributions to adaptively counteract the detrimental impacts of atmospheric attenuation and urban shadowing. These adaptive power adjustments significantly enhance system capacity, particularly improving the Signal-to-Interference-plus-Noise Ratio under diverse operational scenarios. Simulation studies confirm the efficacy of our method within an RIS-enhanced framework, showing an approximate 20\% increase in system capacity through optimized power management between spaceborne and terrestrial signals.
Abstract:Low Probability of Detection (LPD) communication aims to obscure the presence of radio frequency (RF) signals to evade surveillance. In the context of mobile surveillance utilizing unmanned aerial vehicles (UAVs), achieving LPD communication presents significant challenges due to the UAVs' rapid and continuous movements, which are characterized by unknown nonlinear dynamics. Therefore, accurately predicting future locations of UAVs is essential for enabling real-time LPD communication. In this paper, we introduce a novel framework termed predictive covert communication, aimed at minimizing detectability in terrestrial ad-hoc networks under multi-UAV surveillance. Our data-driven method synergistically integrates graph neural networks (GNN) with Koopman theory to model the complex interactions within a multi-UAV network and facilitating long-term predictions by linearizing the dynamics, even with limited historical data. Extensive simulation results substantiate that the predicted trajectories using our method result in at least 63%-75% lower probability of detection when compared to well-known state-of-the-art baseline approaches, showing promise in enabling low-latency covert operations in practical scenarios.
Abstract:The Koopman autoencoder, a data-driven technique, has gained traction for modeling nonlinear dynamics using deep learning methods in recent years. Given the linear characteristics inherent to the Koopman operator, controlling its eigenvalues offers an opportunity to enhance long-term prediction performance, a critical task for forecasting future trends in time-series datasets with long-term behaviors. However, controlling eigenvalues is challenging due to high computational complexity and difficulties in managing them during the training process. To tackle this issue, we propose leveraging the singular value decomposition (SVD) of the Koopman matrix to adjust the singular values for better long-term prediction. Experimental results demonstrate that, during training, the loss term for singular values effectively brings the eigenvalues close to the unit circle, and the proposed approach outperforms existing baseline methods for long-term prediction tasks.
Abstract:In order to extract governing equations from time-series data, various approaches are proposed. Among those, sparse identification of nonlinear dynamics (SINDy) stands out as a successful method capable of modeling governing equations with a minimal number of terms, utilizing the principles of compressive sensing. This feature, which relies on a small number of terms, is crucial for interpretability. The effectiveness of SINDy hinges on the choice of candidate functions within its dictionary to extract governing equations of dynamical systems. A larger dictionary allows for more terms, enhancing the quality of approximations. However, the computational complexity scales with dictionary size, rendering SINDy less suitable for high-dimensional datasets, even though it has been successfully applied to low-dimensional datasets. To address this challenge, we introduce iterative SINDy in this paper, where the dictionary undergoes expansion and compression through iterations. We also conduct an analysis of the convergence properties of iterative SINDy. Simulation results validate that iterative SINDy can achieve nearly identical performance to SINDy, while significantly reducing computational complexity. Notably, iterative SINDy demonstrates effectiveness with high-dimensional time-series data without incurring the prohibitively high computational cost associated with SINDy.
Abstract:Low Earth orbit (LEO) satellites play a crucial role in providing global connectivity for non-terrestrial networks (NTNs) and supporting various Internet-of-Remote-Things (IoRT) applications. Each LEO satellite functions as a relay node in the sky, employing store-and-forward transmission strategies that necessitate the use of buffers. However, due to the finite size of these buffers, occurrences of buffer overflow leading to packet loss are inevitable. In this paper, we demonstrate how inter-satellite links (ISLs) can mitigate the probability of buffer overflow. Specifically, we propose an approach to reallocate packets among LEO satellites via ISLs to minimize the occurrence of buffer overflow events. Consequently, the implementation of ISLs can lead to a more reliable satellite network, enabling efficient packet reallocation to reduce the probability of buffer overflow.
Abstract:In the new paradigm of semantic communication (SC), the focus is on delivering meanings behind bits by extracting semantic information from raw data. Recent advances in data-to-text models facilitate language-oriented SC, particularly for text-transformed image communication via image-to-text (I2T) encoding and text-to-image (T2I) decoding. However, although semantically aligned, the text is too coarse to precisely capture sophisticated visual features such as spatial locations, color, and texture, incurring a significant perceptual difference between intended and reconstructed images. To address this limitation, in this paper, we propose a novel language-oriented SC framework that communicates both text and a compressed image embedding and combines them using a latent diffusion model to reconstruct the intended image. Experimental results validate the potential of our approach, which transmits only 2.09\% of the original image size while achieving higher perceptual similarities in noisy communication channels compared to a baseline SC method that communicates only through text.The code is available at https://github.com/ispamm/Img2Img-SC/ .
Abstract:In recent years, novel communication strategies have emerged to face the challenges that the increased number of connected devices and the higher quality of transmitted information are posing. Among them, semantic communication obtained promising results especially when combined with state-of-the-art deep generative models, such as large language or diffusion models, able to regenerate content from extremely compressed semantic information. However, most of these approaches focus on single-user scenarios processing the received content at the receiver on top of conventional communication systems. In this paper, we propose to go beyond these methods by developing a novel generative semantic communication framework tailored for multi-user scenarios. This system assigns the channel to users knowing that the lost information can be filled in with a diffusion model at the receivers. Under this innovative perspective, OFDMA systems should not aim to transmit the largest part of information, but solely the bits necessary to the generative model to semantically regenerate the missing ones. The thorough experimental evaluation shows the capabilities of the novel diffusion model and the effectiveness of the proposed framework, leading towards a GenAI-based next generation of communications.
Abstract:In this article, we propose a multi-agent deep reinforcement learning (MADRL) framework to train a multiple access protocol for downlink low earth orbit (LEO) satellite networks. By improving the existing learned protocol, emergent random access channel (eRACH), our proposed method, coined centralized and compressed emergent signaling for eRACH (Ce2RACH), can mitigate inter-satellite interference by exchanging additional signaling messages jointly learned through the MADRL training process. Simulations demonstrate that Ce2RACH achieves up to 36.65% higher network throughput compared to eRACH, while the cost of signaling messages increase linearly with the number of users.
Abstract:While deep generative models are showing exciting abilities in computer vision and natural language processing, their adoption in communication frameworks is still far underestimated. These methods are demonstrated to evolve solutions to classic communication problems such as denoising, restoration, or compression. Nevertheless, generative models can unveil their real potential in semantic communication frameworks, in which the receiver is not asked to recover the sequence of bits used to encode the transmitted (semantic) message, but only to regenerate content that is semantically consistent with the transmitted message. Disclosing generative models capabilities in semantic communication paves the way for a paradigm shift with respect to conventional communication systems, which has great potential to reduce the amount of data traffic and offers a revolutionary versatility to novel tasks and applications that were not even conceivable a few years ago. In this paper, we present a unified perspective of deep generative models in semantic communication and we unveil their revolutionary role in future communication frameworks, enabling emerging applications and tasks. Finally, we analyze the challenges and opportunities to face to develop generative models specifically tailored for communication systems.