Abstract:Transformer-based large language models exhibit groundbreaking capabilities, but their storage and computational costs are prohibitively high, limiting their application in resource-constrained scenarios. An effective approach is to eliminate redundant model parameters and computational costs while incorporating efficient expert-derived knowledge structures to achieve a balance between compression and performance. Therefore, we propose the \textit{Sememe Entanglement Encoding (SEE)} algorithm. Guided by expert prior knowledge, the model is compressed through the low-rank approximation idea. In Entanglement Embedding, basic semantic units such as sememes are represented as low-dimensional vectors, and then reconstructed into high-dimensional word embeddings through the combination of generalized quantum entanglement. We adapt the Sememe Entanglement Encoding algorithm to transformer-based models of different magnitudes. Experimental results indicate that our approach achieves stable performance while compressing model parameters and computational costs.
Abstract:Recent advances in deepfake forensics have primarily focused on improving the classification accuracy and generalization performance. Despite enormous progress in detection accuracy across a wide variety of forgery algorithms, existing algorithms lack intuitive interpretability and identity traceability to help with forensic investigation. In this paper, we introduce a novel DeepFake Identity Recovery scheme (DFREC) to fill this gap. DFREC aims to recover the pair of source and target faces from a deepfake image to facilitate deepfake identity tracing and reduce the risk of deepfake attack. It comprises three key components: an Identity Segmentation Module (ISM), a Source Identity Reconstruction Module (SIRM), and a Target Identity Reconstruction Module (TIRM). The ISM segments the input face into distinct source and target face information, and the SIRM reconstructs the source face and extracts latent target identity features with the segmented source information. The background context and latent target identity features are synergetically fused by a Masked Autoencoder in the TIRM to reconstruct the target face. We evaluate DFREC on six different high-fidelity face-swapping attacks on FaceForensics++, CelebaMegaFS and FFHQ-E4S datasets, which demonstrate its superior recovery performance over state-of-the-art deepfake recovery algorithms. In addition, DFREC is the only scheme that can recover both pristine source and target faces directly from the forgery image with high fadelity.
Abstract:Implicit neural representations have emerged as a powerful paradigm to represent signals such as images and sounds. This approach aims to utilize neural networks to parameterize the implicit function of the signal. However, when representing implicit functions, traditional neural networks such as ReLU-based multilayer perceptrons face challenges in accurately modeling high-frequency components of signals. Recent research has begun to explore the use of Fourier Neural Networks (FNNs) to overcome this limitation. In this paper, we propose Quantum Implicit Representation Network (QIREN), a novel quantum generalization of FNNs. Furthermore, through theoretical analysis, we demonstrate that QIREN possesses a quantum advantage over classical FNNs. Lastly, we conducted experiments in signal representation, image superresolution, and image generation tasks to show the superior performance of QIREN compared to state-of-the-art (SOTA) models. Our work not only incorporates quantum advantages into implicit neural representations but also uncovers a promising application direction for Quantum Neural Networks.
Abstract:Intelligent task-oriented semantic communications (SemComs) have witnessed great progress with the development of deep learning (DL). In this paper, we propose a semantic-aware hybrid automatic repeat request (SemHARQ) framework for the robust and efficient transmissions of semantic features. First, to improve the robustness and effectiveness of semantic coding, a multi-task semantic encoder is proposed. Meanwhile, a feature importance ranking (FIR) method is investigated to ensure the important features delivery under limited channel resources. Then, to accurately detect the possible transmission errors, a novel feature distortion evaluation (FDE) network is designed to identify the distortion level of each feature, based on which an efficient HARQ method is proposed. Specifically, the corrupted features are retransmitted, where the remaining channel resources are used for incremental transmissions. The system performance is evaluated under different channel conditions in multi-task scenarios in Internet of Vehicles. Extensive experiments show that the proposed framework outperforms state-of-the-art works by more than 20% in rank-1 accuracy for vehicle re-identification, and 10% in vehicle color classification accuracy in the low signal-to-noise ratio regime.
Abstract:Semantic communications are expected to be an innovative solution to the emerging intelligent applications in the era of connected intelligence. In this paper, a novel scalable multitask semantic communication system with feature importance ranking (SMSC-FIR) is explored. Firstly, the multi-task correlations are investigated by a joint semantic encoder to extract relevant features. Then, a new scalable coding method is proposed based on feature importance ranking, which dynamically adjusts the coding rate and guarantees that important features for semantic tasks are transmitted with higher priority. Simulation results show that SMSC-FIR achieves performance gain w.r.t. individual intelligent tasks, especially in the low SNR regime.
Abstract:In online advertising, it is highly important to predict the probability and the value of a conversion (e.g., a purchase). It not only impacts user experience by showing relevant ads, but also affects ROI of advertisers and revenue of marketplaces. Unlike clicks, which often occur within minutes after impressions, conversions are expected to happen over a long period of time (e.g., 30 days for online shopping). It creates a challenge, as the true labels are only available after the long delays. Either inaccurate labels (partial conversions) are used, or models are trained on stale data (e.g., from 30 days ago). The problem is more eminent in online learning, which focuses on the live performance on the latest data. In this paper, a novel solution is presented to address this challenge using multi-head modeling. Unlike traditional methods, it directly quantizes conversions into multiple windows, such as day 1, day 2, day 3-7, and day 8-30. A sub-model is trained specifically on conversions within each window. Label freshness is maximally preserved in early models (e.g., day 1 and day 2), while late conversions are accurately utilized in models with longer delays (e.g., day 8-30). It is shown to greatly exceed the performance of known methods in online learning experiments for both conversion rate (CVR) and value per click (VPC) predictions. Lastly, as a general method for delayed feedback modeling, it can be combined with any advanced ML techniques to further improve the performance.
Abstract:Millimeter wave (mmWave) communications can potentially meet the high data-rate requirements of unmanned aerial vehicle (UAV) networks. However, as the prerequisite of mmWave communications, the narrow directional beam tracking is very challenging because of the three-dimensional (3D) mobility and attitude variation of UAVs. Aiming to address the beam tracking difficulties, we propose to integrate the conformal array (CA) with the surface of each UAV, which enables the full spatial coverage and the agile beam tracking in highly dynamic UAV mmWave networks. More specifically, the key contributions of our work are three-fold. 1) A new mmWave beam tracking framework is established for the CA-enabled UAV mmWave network. 2) A specialized hierarchical codebook is constructed to drive the directional radiating element (DRE)-covered cylindrical conformal array (CCA), which contains both the angular beam pattern and the subarray pattern to fully utilize the potential of the CA. 3) A codebook-based multiuser beam tracking scheme is proposed, where the Gaussian process machine learning enabled UAV position/attitude predication is developed to improve the beam tracking efficiency in conjunction with the tracking-error aware adaptive beamwidth control. Simulation results validate the effectiveness of the proposed codebook-based beam tracking scheme in the CA-enabled UAV mmWave network, and demonstrate the advantages of CA over the conventional planner array in terms of spectrum efficiency and outage probability in the highly dynamic scenarios.