Information School Capital University of Economics and Business, China
Abstract:Semantic communication focuses on transmitting the meaning of data, aiming for efficient, relevant communication, while non-orthogonal multiple access (NOMA) enhances spectral efficiency by allowing multiple users to share the same spectrum. Integrating semantic users into a NOMA network with bit-based users improves both transmission and spectrum efficiency. However, the performance metric for semantic communication differs significantly from that of traditional communication, posing challenges in simultaneously meeting individual user demands and minimizing transmission power, especially in scenarios with coexisting semantic and bit-based users. Furthermore, the different hardware architectures of semantic and bit-based users complicate the implementation of successive interference cancellation (SIC). To address these challenges, in this paper, we propose a clustered framework to mitigate the complexity of SIC and two multiple access (MA) schemes, e.g., pure cluster-based NOMA (P-CNOMA) and hybrid cluster-based NOMA (H-CNOMA), to minimize the total transmission power. The P-CNOMA scheme can achieve the minimum transmission power, but may not satisfy the high quality of service (QoS) requirement. In contrast, H-CNOMA addresses these issues with a slight increase in power and a reduced semantic rate. These two schemes complement each other, enabling an adaptive MA selection mechanism that adapts to specific network conditions and user requirements.
Abstract:Retrieval-augmented generation (RAG) improves the service quality of large language models by retrieving relevant documents from credible literature and integrating them into the context of the user query. Recently, the rise of the cloud RAG service has made it possible for users to query relevant documents conveniently. However, directly sending queries to the cloud brings potential privacy leakage. In this paper, we are the first to formally define the privacy-preserving cloud RAG service to protect the user query and propose RemoteRAG as a solution regarding privacy, efficiency, and accuracy. For privacy, we introduce $(n,\epsilon)$-DistanceDP to characterize privacy leakage of the user query and the leakage inferred from relevant documents. For efficiency, we limit the search range from the total documents to a small number of selected documents related to a perturbed embedding generated from $(n,\epsilon)$-DistanceDP, so that computation and communication costs required for privacy protection significantly decrease. For accuracy, we ensure that the small range includes target documents related to the user query with detailed theoretical analysis. Experimental results also demonstrate that RemoteRAG can resist existing embedding inversion attack methods while achieving no loss in retrieval under various settings. Moreover, RemoteRAG is efficient, incurring only $0.67$ seconds and $46.66$KB of data transmission ($2.72$ hours and $1.43$ GB with the non-optimized privacy-preserving scheme) when retrieving from a total of $10^6$ documents.
Abstract:As intellectual property rights, the copyright protection of deep models is becoming increasingly important. Existing work has made many attempts at model watermarking and fingerprinting, but they have ignored homologous models trained with similar structures or training datasets. We highlight challenges in efficiently querying black-box piracy models to protect model copyrights without misidentifying homologous models. To address these challenges, we propose a novel method called DeepCore, which discovers that the classification confidence of the model is positively correlated with the distance of the predicted sample from the model decision boundary and piracy models behave more similarly at high-confidence classified sample points. Then DeepCore constructs core points far away from the decision boundary by optimizing the predicted confidence of a few sample points and leverages behavioral discrepancies between piracy and homologous models to identify piracy models. Finally, we design different model identification methods, including two similarity-based methods and a clustering-based method to identify piracy models using models' predictions of core points. Extensive experiments show the effectiveness of DeepCore in identifying various piracy models, achieving lower missed and false identification rates, and outperforming state-of-the-art methods.
Abstract:Autoformalization is the task of automatically translating mathematical content written in natural language to a formal language expression. The growing language interpretation capabilities of Large Language Models (LLMs), including in formal languages, are lowering the barriers for autoformalization. However, LLMs alone are not capable of consistently and reliably delivering autoformalization, in particular as the complexity and specialization of the target domain grows. As the field evolves into the direction of systematically applying autoformalization towards large mathematical libraries, the need to improve syntactic, terminological and semantic control increases. This paper proposes the coordinated use of three mechanisms, most-similar retrieval augmented generation (MS-RAG), denoising steps, and auto-correction with syntax error feedback (Auto-SEF) to improve autoformalization quality. The empirical analysis, across different models, demonstrates that these mechanisms can deliver autoformalizaton results which are syntactically, terminologically and semantically more consistent. These mechanisms can be applied across different LLMs and have shown to deliver improve results across different model types.
Abstract:Aligned LLMs are highly secure, capable of recognizing and refusing to answer malicious questions. However, the role of internal parameters in maintaining this security is not well understood, further these models are vulnerable to security degradation when fine-tuned with non-malicious backdoor data or normal data. To address these challenges, our work uncovers the mechanism behind security in aligned LLMs at the parameter level, identifying a small set of contiguous layers in the middle of the model that are crucial for distinguishing malicious queries from normal ones, referred to as "safety layers." We first confirm the existence of these safety layers by analyzing variations in input vectors within the model's internal layers. Additionally, we leverage the over-rejection phenomenon and parameters scaling analysis to precisely locate the safety layers. Building on this understanding, we propose a novel fine-tuning approach, Safely Partial-Parameter Fine-Tuning (SPPFT), that fixes the gradient of the safety layers during fine-tuning to address the security degradation. Our experiments demonstrate that this approach significantly preserves model security while maintaining performance and reducing computational resources compared to full fine-tuning.
Abstract:Jailbreak attacks on Language Model Models (LLMs) entail crafting prompts aimed at exploiting the models to generate malicious content. This paper proposes a new type of jailbreak attacks which shift the attention of the LLM by inserting a prohibited query into a carrier article. The proposed attack leverage the knowledge graph and a composer LLM to automatically generating a carrier article that is similar to the topic of the prohibited query but does not violate LLM's safeguards. By inserting the malicious query to the carrier article, the assembled attack payload can successfully jailbreak LLM. To evaluate the effectiveness of our method, we leverage 4 popular categories of ``harmful behaviors'' adopted by related researches to attack 6 popular LLMs. Our experiment results show that the proposed attacking method can successfully jailbreak all the target LLMs which high success rate, except for Claude-3.
Abstract:Semantic communication (SemCom) has been a transformative paradigm, emphasizing the precise exchange of meaningful information over traditional bit-level transmissions. However, existing SemCom research, primarily centered on simplified scenarios like single-pair transmissions with direct wireless links, faces significant challenges when applied to real-world radio access networks (RANs). This article introduces a Semantic-aware Radio Access Network (S-RAN), offering a holistic systematic view of SemCom beyond single-pair transmissions. We begin by outlining the S-RAN architecture, introducing new physical components and logical functions along with key design challenges. We then present transceiver design for end-to-end transmission to overcome conventional SemCom transceiver limitations, including static channel conditions, oversimplified background knowledge models, and hardware constraints. Later, we delve into the discussion on radio resource management for multiple users, covering semantic channel modeling, performance metrics, resource management algorithms, and a case study, to elaborate distinctions from resource management for legacy RANs. Finally, we highlight open research challenges and potential solutions. The objective of this article is to serve as a basis for advancing SemCom research into practical wireless systems.
Abstract:In medical image segmentation tasks, diffusion models have shown significant potential. However, mainstream diffusion models suffer from drawbacks such as multiple sampling times and slow prediction results. Recently, consistency models, as a standalone generative network, have resolved this issue. Compared to diffusion models, consistency models can reduce the sampling times to once, not only achieving similar generative effects but also significantly speeding up training and prediction. However, they are not suitable for image segmentation tasks, and their application in the medical imaging field has not yet been explored. Therefore, this paper applies the consistency model to medical image segmentation tasks, designing multi-scale feature signal supervision modes and loss function guidance to achieve model convergence. Experiments have verified that the CTS model can obtain better medical image segmentation results with a single sampling during the test phase.
Abstract:3D object detection plays an important role in autonomous driving; however, its vulnerability to backdoor attacks has become evident. By injecting ''triggers'' to poison the training dataset, backdoor attacks manipulate the detector's prediction for inputs containing these triggers. Existing backdoor attacks against 3D object detection primarily poison 3D LiDAR signals, where large-sized 3D triggers are injected to ensure their visibility within the sparse 3D space, rendering them easy to detect and impractical in real-world scenarios. In this paper, we delve into the robustness of 3D object detection, exploring a new backdoor attack surface through 2D cameras. Given the prevalent adoption of camera and LiDAR signal fusion for high-fidelity 3D perception, we investigate the latent potential of camera signals to disrupt the process. Although the dense nature of camera signals enables the use of nearly imperceptible small-sized triggers to mislead 2D object detection, realizing 2D-oriented backdoor attacks against 3D object detection is non-trivial. The primary challenge emerges from the fusion process that transforms camera signals into a 3D space, compromising the association with the 2D trigger to the target output. To tackle this issue, we propose an innovative 2D-oriented backdoor attack against LiDAR-camera fusion methods for 3D object detection, named BadFusion, for preserving trigger effectiveness throughout the entire fusion process. The evaluation demonstrates the effectiveness of BadFusion, achieving a significantly higher attack success rate compared to existing 2D-oriented attacks.
Abstract:Task-oriented semantic communications (TSC) enhance radio resource efficiency by transmitting task-relevant semantic information. However, current research often overlooks the inherent semantic distinctions among encoded features. Due to unavoidable channel variations from time and frequency-selective fading, semantically sensitive feature units could be more susceptible to erroneous inference if corrupted by dynamic channels. Therefore, this letter introduces a unified channel-resilient TSC framework via information bottleneck. This framework complements existing TSC approaches by controlling information flow to capture fine-grained feature-level semantic robustness. Experiments on a case study for real-time subchannel allocation validate the framework's effectiveness.