Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xicheng Zhang

RAG+: Enhancing Retrieval-Augmented Generation with Application-Aware Reasoning

Jun 13, 2025

Yu Wang, Shiwan Zhao, Ming Fan, Zhihu Wang, Yubo Zhang, Xicheng Zhang, Zhengfan Wang, Heyuan Huang, Ting Liu

Abstract:The integration of external knowledge through Retrieval-Augmented Generation (RAG) has become foundational in enhancing large language models (LLMs) for knowledge-intensive tasks. However, existing RAG paradigms often overlook the cognitive step of applying knowledge, leaving a gap between retrieved facts and task-specific reasoning. In this work, we introduce RAG+, a principled and modular extension that explicitly incorporates application-aware reasoning into the RAG pipeline. RAG+ constructs a dual corpus consisting of knowledge and aligned application examples, created either manually or automatically, and retrieves both jointly during inference. This design enables LLMs not only to access relevant information but also to apply it within structured, goal-oriented reasoning processes. Experiments across mathematical, legal, and medical domains, conducted on multiple models, demonstrate that RAG+ consistently outperforms standard RAG variants, achieving average improvements of 3-5%, and peak gains up to 7.5% in complex scenarios. By bridging retrieval with actionable application, RAG+ advances a more cognitively grounded framework for knowledge integration, representing a step toward more interpretable and capable LLMs.

Via

Access Paper or Ask Questions

Stealthy Backdoor Attack to Real-world Models in Android Apps

Jan 02, 2025

Jiali Wei, Ming Fan, Xicheng Zhang, Wenjing Jiao, Haijun Wang, Ting Liu

Figure 1 for Stealthy Backdoor Attack to Real-world Models in Android Apps

Figure 2 for Stealthy Backdoor Attack to Real-world Models in Android Apps

Figure 3 for Stealthy Backdoor Attack to Real-world Models in Android Apps

Figure 4 for Stealthy Backdoor Attack to Real-world Models in Android Apps

Abstract:Powered by their superior performance, deep neural networks (DNNs) have found widespread applications across various domains. Many deep learning (DL) models are now embedded in mobile apps, making them more accessible to end users through on-device DL. However, deploying on-device DL to users' smartphones simultaneously introduces several security threats. One primary threat is backdoor attacks. Extensive research has explored backdoor attacks for several years and has proposed numerous attack approaches. However, few studies have investigated backdoor attacks on DL models deployed in the real world, or they have shown obvious deficiencies in effectiveness and stealthiness. In this work, we explore more effective and stealthy backdoor attacks on real-world DL models extracted from mobile apps. Our main justification is that imperceptible and sample-specific backdoor triggers generated by DNN-based steganography can enhance the efficacy of backdoor attacks on real-world models. We first confirm the effectiveness of steganography-based backdoor attacks on four state-of-the-art DNN models. Subsequently, we systematically evaluate and analyze the stealthiness of the attacks to ensure they are difficult to perceive. Finally, we implement the backdoor attacks on real-world models and compare our approach with three baseline methods. We collect 38,387 mobile apps, extract 89 DL models from them, and analyze these models to obtain the prerequisite model information for the attacks. After identifying the target models, our approach achieves an average of 12.50% higher attack success rate than DeepPayload while better maintaining the normal performance of the models. Extensive experimental results demonstrate that our method enables more effective, robust, and stealthy backdoor attacks on real-world models.

Via

Access Paper or Ask Questions

LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

Aug 12, 2024

Tianhao Yu, Cai Yao, Zhuorui Sun, Feng Shi, Lin Zhang, Kangjie Lyu, Xuan Bai, Andong Liu, Xicheng Zhang, Jiali Zou(+3 more)

Figure 1 for LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

Figure 2 for LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

Figure 3 for LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

Figure 4 for LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library

Abstract:In this study, we generate and maintain a database of 10 million virtual lipids through METiS's in-house de novo lipid generation algorithms and lipid virtual screening techniques. These virtual lipids serve as a corpus for pre-training, lipid representation learning, and downstream task knowledge transfer, culminating in state-of-the-art LNP property prediction performance. We propose LipidBERT, a BERT-like model pre-trained with the Masked Language Model (MLM) and various secondary tasks. Additionally, we compare the performance of embeddings generated by LipidBERT and PhatGPT, our GPT-like lipid generation model, on downstream tasks. The proposed bilingual LipidBERT model operates in two languages: the language of ionizable lipid pre-training, using in-house dry-lab lipid structures, and the language of LNP fine-tuning, utilizing in-house LNP wet-lab data. This dual capability positions LipidBERT as a key AI-based filter for future screening tasks, including new versions of METiS de novo lipid libraries and, more importantly, candidates for in vivo testing for orgran-targeting LNPs. To the best of our knowledge, this is the first successful demonstration of the capability of a pre-trained language model on virtual lipids and its effectiveness in downstream tasks using web-lab data. This work showcases the clever utilization of METiS's in-house de novo lipid library as well as the power of dry-wet lab integration.

Via

Access Paper or Ask Questions

New algorithms for sampling and diffusion models

Jun 14, 2024

Xicheng Zhang

Abstract:Drawing from the theory of stochastic differential equations, we introduce a novel sampling method for known distributions and a new algorithm for diffusion generative models with unknown distributions. Our approach is inspired by the concept of the reverse diffusion process, widely adopted in diffusion generative models. Additionally, we derive the explicit convergence rate based on the smooth ODE flow. For diffusion generative models and sampling, we establish a {\it dimension-free} particle approximation convergence result. Numerical experiments demonstrate the effectiveness of our method. Notably, unlike the traditional Langevin method, our sampling method does not require any regularity assumptions about the density function of the target distribution. Furthermore, we also apply our method to optimization problems.

* 24pages

Via

Access Paper or Ask Questions