Abstract:In this paper, we propose a cross-layer encrypted semantic communication (CLESC) framework for panoramic video transmission, incorporating feature extraction, encoding, encryption, cyclic redundancy check (CRC), and retransmission processes to achieve compatibility between semantic communication and traditional communication systems. Additionally, we propose an adaptive cross-layer transmission mechanism that dynamically adjusts CRC, channel coding, and retransmission schemes based on the importance of semantic information. This ensures that important information is prioritized under poor transmission conditions. To verify the aforementioned framework, we also design an end-to-end adaptive panoramic video semantic transmission (APVST) network that leverages a deep joint source-channel coding (Deep JSCC) structure and attention mechanism, integrated with a latitude adaptive module that facilitates adaptive semantic feature extraction and variable-length encoding of panoramic videos. The proposed CLESC is also applicable to the transmission of other modal data. Simulation results demonstrate that the proposed CLESC effectively achieves compatibility and adaptation between semantic communication and traditional communication systems, improving both transmission efficiency and channel adaptability. Compared to traditional cross-layer transmission schemes, the CLESC framework can reduce bandwidth consumption by 85% while showing significant advantages under low signal-to-noise ratio (SNR) conditions.
Abstract:Physical-Layer Authentication (PLA) offers endogenous security, lightweight implementation, and high reliability, making it a promising complement to upper-layer security methods in Edge Intelligence (EI)-empowered Industrial Internet of Things (IIoT). However, state-of-the-art Channel State Information (CSI)-based PLA schemes face challenges in recognizing mobile multi-users due to the limited reliability of CSI fingerprints in low Signal-to-Noise Ratio (SNR) environments and the constantly shifting CSI distributions with user movements. To address these issues, we propose a Temporal Dynamic Graph Convolutional Network (TDGCN)-based PLA scheme. This scheme harnesses Intelligent Reflecting Surfaces (IRSs) to refine CSI fingerprint precision and employs Graph Neural Networks (GNNs) to capture the spatio-temporal dynamics induced by user movements and IRS deployments. Specifically, we partition hierarchical CSI fingerprints into multivariate time series and utilize dynamic GNNs to capture their associations. Additionally, Temporal Convolutional Networks (TCNs) handle temporal dependencies within each CSI fingerprint dimension. Dynamic Graph Isomorphism Networks (GINs) and cascade node clustering pooling further enable efficient information aggregation and reduced computational complexity. Simulations demonstrate the proposed scheme's superior authentication accuracy compared to seven baseline schemes.
Abstract:Integrated sensing and communications (ISAC) as one of the key technologies is capable of supporting high-speed communication and high-precision sensing for the upcoming 6G. This paper studies a waveform strategy by designing the orthogonal frequency division multiplexing (OFDM)-based reference signal (RS) for sensing and communication in ISAC system. We derive the closed-form expressions of Cram\'er-Rao Bound (CRB) for the distance and velocity estimations, and obtain the communication rate under the mean square error of channel estimation. Then, a weighted sum CRB minimization problem on the distance and velocity estimations is formulated by considering communication rate requirement and RS intervals constraints, which is a mixed-integer problem due to the discrete RS interval values. To solve this problem, some numerical methods are typically adopted to obtain the optimal solutions, whose computational complexity grow exponentially with the number of symbols and subcarriers of OFDM. Therefore, we propose a relaxation and approximation method to transform the original discrete problem into a continuous convex one and obtain the sub-optimal solutions. Finally, our proposed scheme is compared with the exhaustive search method in numerical simulations, which show slight gap between the obtained sub-optimal and optimal solutions, and this gap further decreases with large weight factor.
Abstract:Large Language Models (LLMs) have achieved significant success in various natural language processing tasks, but the role of wireless networks in supporting LLMs has not been thoroughly explored. In this paper, we propose a wireless distributed Mixture of Experts (WDMoE) architecture to enable collaborative deployment of LLMs across edge servers at the base station (BS) and mobile devices in wireless networks. Specifically, we decompose the MoE layer in LLMs by placing the gating network and the preceding neural network layer at BS, while distributing the expert networks among the devices. This deployment leverages the parallel inference capabilities of expert networks on mobile devices, effectively utilizing the limited computing and caching resources of these devices. Accordingly, we develop a performance metric for WDMoE-based LLMs, which accounts for both model capability and latency. To minimize the latency while maintaining accuracy, we jointly optimize expert selection and bandwidth allocation based on the performance metric. Moreover, we build a hardware testbed using NVIDIA Jetson kits to validate the effectiveness of WDMoE. Both theoretical simulations and practical hardware experiments demonstrate that the proposed method can significantly reduce the latency without compromising LLM performance.
Abstract:The end-to-end image communication system has been widely studied in the academic community. The escalating demands on image communication systems in terms of data volume, environmental complexity, and task precision require enhanced communication efficiency, anti-noise ability and semantic fidelity. Therefore, we proposed a novel paradigm based on Semantic Feature Decomposition (SeFD) for the integration of semantic communication and large-scale visual generation models to achieve high-performance, highly interpretable and controllable image communication. According to this paradigm, a Texture-Color based Semantic Communication system of Images TCSCI is proposed. TCSCI decomposing the images into their natural language description (text), texture and color semantic features at the transmitter. During the transmission, features are transmitted over the wireless channel, and at the receiver, a large-scale visual generation model is utilized to restore the image through received features. TCSCI can achieve extremely compressed, highly noise-resistant, and visually similar image semantic communication, while ensuring the interpretability and editability of the transmission process. The experiments demonstrate that the TCSCI outperforms traditional image communication systems and existing semantic communication systems under extreme compression with good anti-noise performance and interpretability.
Abstract:The key feature of model-driven semantic communication is the propagation of the model. The semantic model component (SMC) is designed to drive the intelligent model to transmit in the physical channel, allowing the intelligence to flow through the networks. According to the characteristics of neural networks with common and individual model parameters, this paper designs the cross-source-domain and cross-task semantic component model. Considering that the basic model is deployed on the edge node, the large server node updates the edge node by transmitting only the semantic component model to the edge node so that the edge node can handle different sources and different tasks. In addition, this paper also discusses how channel noise affects the performance of the model and proposes methods of injection noise and regularization to improve the noise resistance of the model. Experiments show that SMCs use smaller model parameters to achieve cross-source, cross-task functionality while maintaining performance and improving the model's tolerance to noise. Finally, a component transfer-based unmanned vehicle tracking prototype was implemented to verify the feasibility of model components in practical applications.
Abstract:Lightweight and efficient neural network models for deep joint source-channel coding (JSCC) are crucial for semantic communications. In this paper, we propose a novel JSCC architecture, named MambaJSCC, that achieves state-of-the-art performance with low computational and parameter overhead. MambaJSCC utilizes the visual state space model with channel adaptation (VSSM-CA) blocks as its backbone for transmitting images over wireless channels, where the VSSM-CA primarily consists of the generalized state space models (GSSM) and the zero-parameter, zero-computational channel adaptation method (CSI-ReST). We design the GSSM module, leveraging reversible matrix transformations to express generalized scan expanding operations, and theoretically prove that two GSSM modules can effectively capture global information. We discover that GSSM inherently possesses the ability to adapt to channels, a form of endogenous intelligence. Based on this, we design the CSI-ReST method, which injects channel state information (CSI) into the initial state of GSSM to utilize its native response, and into the residual state to mitigate CSI forgetting, enabling effective channel adaptation without introducing additional computational and parameter overhead. Experimental results show that MambaJSCC not only outperforms existing JSCC methods (e.g., SwinJSCC) across various scenarios but also significantly reduces parameter size, computational overhead, and inference delay.
Abstract:Next-generation wireless networks are expected to develop a novel paradigm of integrated sensing and communications (ISAC) to enable both the high-accuracy sensing and high-speed communications. However, conventional mono-static ISAC systems, which simultaneously transmit and receive at the same equipment, may suffer from severe self-interference, and thus significantly degrade the system performance.To address this issue, this paper studies a multi-static ISAC system for cooperative target localization and communications, where the transmitter transmits ISAC signal to multiple receivers (REs) deployed at different positions. We derive the closed-form Cram\'{e}r-Rao bound (CRB) on the joint estimations of both the transmission delay and Doppler shift for cooperative target localization, and the CRB minimization problem is formulated by considering the cooperative cost and communication rate requirements for the REs. To solve this problem, we first decouple it into two subproblems for RE selection and transmit beamforming, respectively. Then, a minimax linkage-based method is proposed to solve the RE selection subproblem, and a successive convex approximation algorithm is adopted to deal with the transmit beamforming subproblem with non-convex constraints. Finally, numerical results validate our analysis and reveal that our proposed multi-static ISAC scheme achieves better ISAC performance than the conventional mono-static ones when the number of cooperative REs is large.
Abstract:Semantic communication, as a revolutionary communication architecture, is considered a promising novel communication paradigm. Unlike traditional symbol-based error-free communication systems, semantic-based visual communication systems extract, compress, transmit, and reconstruct images at the semantic level. However, widely used image similarity evaluation metrics, whether pixel-based MSE or PSNR or structure-based MS-SSIM, struggle to accurately measure the loss of semantic-level information of the source during system transmission. This presents challenges in evaluating the performance of visual semantic communication systems, especially when comparing them with traditional communication systems. To address this, we propose a semantic evaluation metric -- SeSS (Semantic Similarity Score), based on Scene Graph Generation and graph matching, which shifts the similarity scores between images into semantic-level graph matching scores. Meanwhile, semantic similarity scores for tens of thousands of image pairs are manually annotated to fine-tune the hyperparameters in the graph matching algorithm, aligning the metric more closely with human semantic perception. The performance of the SeSS is tested on different datasets, including (1)images transmitted by traditional and semantic communication systems at different compression rates, (2)images transmitted by traditional and semantic communication systems at different signal-to-noise ratios, (3)images generated by large-scale model with different noise levels introduced, and (4)cases of images subjected to certain special transformations. The experiments demonstrate the effectiveness of SeSS, indicating that the metric can measure the semantic-level differences in semantic-level information of images and can be used for evaluation in visual semantic communication systems.
Abstract:Fine-tuning large-scale pre-trained models via transfer learning is an emerging important paradigm for a wide range of downstream tasks, with performance heavily reliant on extensive data. Federated learning (FL), as a distributed framework, provides a secure solution to train models on local datasets while safeguarding raw sensitive data. However, FL networks encounter high communication costs due to the massive parameters of large-scale pre-trained models, necessitating parameter-efficient methods. Notably, parameter efficient fine tuning, such as Low-Rank Adaptation (LoRA), has shown remarkable success in fine-tuning pre-trained models. However, prior research indicates that the fixed parameter budget may be prone to the overfitting or slower convergence. To address this challenge, we propose a Simulated Annealing-based Federated Learning with LoRA tuning (SA-FedLoRA) approach by reducing trainable parameters. Specifically, SA-FedLoRA comprises two stages: initiating and annealing. (1) In the initiating stage, we implement a parameter regularization approach during the early rounds of aggregation, aiming to mitigate client drift and accelerate the convergence for the subsequent tuning. (2) In the annealing stage, we allocate higher parameter budget during the early 'heating' phase and then gradually shrink the budget until the 'cooling' phase. This strategy not only facilitates convergence to the global optimum but also reduces communication costs. Experimental results demonstrate that SA-FedLoRA is an efficient FL, achieving superior performance to FedAvg and significantly reducing communication parameters by up to 93.62%.