Abstract:The received in-phase and quadrature (I/Q) baseband signals inherently encode physical-layer and channel characteristics of wireless links. Learning robust and transferable representations directly from such raw signals, however, remains challenging due to heterogeneous communication systems, diverse propagation environments, and limited labeled data. To address this, we present LWM-Spectro, a transformer-based foundation model pretrained on large-scale I/Q data represented as time-frequency spectrograms. The model leverages self-supervised masked modeling, contrastive learning, and a mixture-of-experts (MoE) architecture to learn general-purpose wireless representations. These representations transfer effectively to downstream tasks such as modulation classification and joint SNR/mobility recognition, even with minimal supervision. Across tasks, LWM-Spectro consistently outperforms state-of-the-art deep learning baselines in both few-shot and data-rich regimes, providing a unified foundation for wireless learning.
Abstract:This paper introduces a task- and model-aware framework for measuring similarity between wireless datasets, enabling applications such as dataset selection/augmentation, simulation-to-real (sim2real) comparison, task-specific synthetic data generation, and informing decisions on model training/adaptation to new deployments. We evaluate candidate dataset distance metrics by how well they predict cross-dataset transferability: if two datasets have a small distance, a model trained on one should perform well on the other. We apply the framework on an unsupervised task, channel state information (CSI) compression, using autoencoders. Using metrics based on UMAP embeddings, combined with Wasserstein and Euclidean distances, we achieve Pearson correlations exceeding 0.85 between dataset distances and train-on-one/test-on-another task performance. We also apply the framework to a supervised beam prediction in the downlink using convolutional neural networks. For this task, we derive a label-aware distance by integrating supervised UMAP and penalties for dataset imbalance. Across both tasks, the resulting distances outperform traditional baselines and consistently exhibit stronger correlations with model transferability, supporting task-relevant comparisons between wireless datasets.



Abstract:Effective channel estimation in sparse and high-dimensional environments is essential for next-generation wireless systems, particularly in large-scale MIMO deployments. This paper introduces a novel framework that leverages digital twins (DTs) as priors to enable efficient zone-specific subspace-based channel estimation (CE). Subspace-based CE significantly reduces feedback overhead by focusing on the dominant channel components, exploiting sparsity in the angular domain while preserving estimation accuracy. While DT channels may exhibit inaccuracies, their coarse-grained subspaces provide a powerful starting point, reducing the search space and accelerating convergence. The framework employs a two-step clustering process on the Grassmann manifold, combined with reinforcement learning (RL), to iteratively calibrate subspaces and align them with real-world counterparts. Simulations show that digital twins not only enable near-optimal performance but also enhance the accuracy of subspace calibration through RL, highlighting their potential as a step towards learnable digital twins.
Abstract:This paper introduces a task-specific, model-agnostic framework for evaluating dataset similarity, providing a means to assess and compare dataset realism and quality. Such a framework is crucial for augmenting real-world data, improving benchmarking, and making informed retraining decisions when adapting to new deployment settings, such as different sites or frequency bands. The proposed framework is employed to design metrics based on UMAP topology-preserving dimensionality reduction, leveraging Wasserstein and Euclidean distances on latent space KNN clusters. The designed metrics show correlations above 0.85 between dataset distances and model performances on a channel state information compression unsupervised machine learning task leveraging autoencoder architectures. The results show that the designed metrics outperform traditional methods.




Abstract:This paper presents the Large Wireless Model (LWM) -- the world's first foundation model for wireless channels. Designed as a task-agnostic model, LWM generates universal, rich, contextualized channel embeddings (features) that potentially enhance performance across a wide range of downstream tasks in wireless communication and sensing systems. Towards this objective, LWM, which has a transformer-based architecture, was pre-trained in a self-supervised manner on large-scale wireless channel datasets. Our results show consistent improvements in classification and regression tasks when using the LWM embeddings compared to raw channel representations, especially in scenarios with high-complexity machine learning tasks and limited training datasets. This LWM's ability to learn from large-scale wireless data opens a promising direction for intelligent systems that can efficiently adapt to diverse tasks with limited data, paving the way for addressing key challenges in wireless communication and sensing systems.
Abstract:Reconfigurable intelligent surfaces (RISs) are envisioned to play a key role in future wireless communication networks. However, channel estimation in RIS-aided wireless networks is challenging due to their passive nature and the large number of reflective elements, leading to high channel estimation overhead. Additionally, conventional methods like beam sweeping, which do not rely on explicit channel state information, often struggle in managing interference in multi-user networks. In this paper, we propose a novel approach that leverages digital twins (DTs) of the physical environments to approximate channels using electromagnetic 3D models and ray tracing, thus relaxing the need for channel estimation and extensive over-the-air computations in RIS-aided wireless networks. To address the digital twins channel approximation errors, we further refine this approach with a DT-specific robust transmission design that reliably meets minimum desired rates. The results show that our method secures these rates over 90% of the time, significantly outperforming beam sweeping, which achieves these rates less than 8% of the time due to its poor management of transmitting power and interference.