Jianbo Li

GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis

May 24, 2025

Yi Jiang, Sendong Zhao, Jianbo Li, Haochun Wang, Bing Qin

Abstract:The Retrieval-Augmented Generation (RAG) framework introduces a retrieval module to dynamically inject retrieved information into the input context of large language models (LLMs), and has demonstrated significant success in various NLP tasks. However, the current study points out that there is a preference gap between retrievers and LLMs in the RAG framework, which limits the further improvement of system performance. Some highly relevant passages may interfere with LLM reasoning because they contain complex or contradictory information, while some indirectly related or even inaccurate content may help the LLM generate more accurate answers by providing suggestive information or logical clues. To solve this, we propose GainRAG, a novel approach that aligns the retriever's and LLM's preferences by defining a new metric, "gain", which measures how well an input passage contributes to correct outputs. Specifically, we propose a method to estimate these gain signals and train a middleware that aligns the preferences of the retriever and the LLM using only limited data. In addition, we introduce a pseudo-passage strategy to mitigate degradation. The experimental results on 6 datasets verify the effectiveness of GainRAG.

* Accepted by ACL 2025

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

May 21, 2025

Ao Liu, Botong Zhou, Can Xu, Chayse Zhou, ChenChen Zhang, Chengcheng Xu, Chenhao Wang, Decheng Wu, Dengpeng Wu, Dian Jiao(+239 more)

Abstract:As Large Language Models (LLMs) rapidly advance, we introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba Mixture of Experts (MoE) model. It synergistically combines Mamba's long-sequence processing efficiency with Transformer's superior contextual understanding. Hunyuan-TurboS features an adaptive long-short chain-of-thought (CoT) mechanism, dynamically switching between rapid responses for simple queries and deep "thinking" modes for complex problems, optimizing computational resources. Architecturally, this 56B activated (560B total) parameter model employs 128 layers (Mamba2, Attention, FFN) with an innovative AMF/MF block pattern. Faster Mamba2 ensures linear complexity, Grouped-Query Attention minimizes KV cache, and FFNs use an MoE structure. Pre-trained on 16T high-quality tokens, it supports a 256K context length and is the first industry-deployed large-scale Mamba model. Our comprehensive post-training strategy enhances capabilities via Supervised Fine-Tuning (3M instructions), a novel Adaptive Long-short CoT Fusion method, Multi-round Deliberation Learning for iterative improvement, and a two-stage Large-scale Reinforcement Learning process targeting STEM and general instruction-following. Evaluations show strong performance: overall top 7 rank on LMSYS Chatbot Arena with a score of 1356, outperforming leading models like Gemini-2.0-Flash-001 (1352) and o4-mini-2025-04-16 (1345). TurboS also achieves an average of 77.9% across 23 automated benchmarks. Hunyuan-TurboS balances high performance and efficiency, offering substantial capabilities at lower inference costs than many reasoning models, establishing a new paradigm for efficient large-scale pre-trained models.
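
As a rough illustration of why Grouped-Query Attention shrinks the KV cache, the sketch below counts cache bytes for a generic attention stack. The layer count, head counts, head dimension, and sequence length are hypothetical values chosen for easy arithmetic, not the actual Hunyuan-TurboS configuration. In a hybrid model, only the attention layers would hold such a cache.

```python
def kv_cache_bytes(num_attn_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # The factor 2 accounts for storing both a key and a value vector per token.
    return 2 * num_attn_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration (not taken from the paper).
mha_bytes = kv_cache_bytes(num_attn_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)
gqa_bytes = kv_cache_bytes(num_attn_layers=32, num_kv_heads=8, head_dim=128, seq_len=4096)

print(mha_bytes / 2**30, "GiB with one KV head per query head")
print(gqa_bytes / 2**30, "GiB with 8 shared KV heads")
```

The cache grows linearly with the number of KV heads, so sharing each KV head across four query heads cuts it by a factor of four in this toy setting.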

Interpretable modulated differentiable STFT and physics-informed balanced spectrum metric for freight train wheelset bearing cross-machine transfer fault diagnosis under speed fluctuations

Jun 17, 2024

Chao He, Hongmei Shi, Ruixin Li, Jianbo Li, ZuJun Yu

Abstract:The service conditions of wheelset bearings, as key components, have a direct impact on the safe operation of railway heavy-haul freight trains. However, speed fluctuation of the trains and few fault samples are the two main problems that restrict the accuracy of bearing fault diagnosis. Therefore, a cross-machine transfer diagnosis (pyDSN) network coupled with an interpretable modulated differentiable short-time Fourier transform (STFT) and a physics-informed balanced spectrum quality metric is proposed to learn domain-invariant and discriminative features under time-varying speeds. Firstly, because fixed windows are insufficient for extracting the frequency components of time-varying speed signals, a modulated differentiable STFT (MDSTFT), which is interpretable with STFT-informed theoretical support, is proposed to extract a robust time-frequency spectrum (TFS). During the training process, multiple windows with different lengths change dynamically. Also, in addition to the classification metric and domain discrepancy metric, we creatively introduce a third kind of metric, referred to as the physics-informed metric, to enhance the transferable TFS. A physics-informed balanced spectrum quality (BSQ) regularization loss is devised to guide the optimization direction for the MDSTFT and the model. With it, not only can the model acquire a high-quality TFS, but a physics-restricted domain adaptation network can also be obtained, which learns real-world physics knowledge and ultimately diminishes the domain discrepancy across different datasets. The experiment is conducted in the scenario of migrating from laboratory datasets to a freight train dataset, indicating that the hybrid-driven pyDSN outperforms existing methods and has practical value.

* Advanced Engineering Informatics, 2024
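
To make the role of the window length concrete, here is a minimal sketch of a plain (non-differentiable, non-modulated) STFT computed with two fixed window lengths. It is not the MDSTFT from the abstract. The function names, the Hann window, the 400 Hz sample rate, and the 50 Hz test tone are all assumptions made for illustration.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def stft_magnitude(signal, window_length, hop):
    """Naive STFT: one magnitude spectrum (bins 0..N/2) per frame."""
    win = hann(window_length)
    frames = []
    for start in range(0, len(signal) - window_length + 1, hop):
        seg = [signal[start + i] * win[i] for i in range(window_length)]
        spectrum = []
        for k in range(window_length // 2 + 1):
            acc = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / window_length)
                      for n in range(window_length))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def peak_frequency_hz(frames, sample_rate, window_length):
    """Frequency of the strongest bin in the first frame."""
    first = frames[0]
    k = max(range(len(first)), key=first.__getitem__)
    return k * sample_rate / window_length


SAMPLE_RATE = 400  # Hz, hypothetical
signal = [math.sin(2 * math.pi * 50 * n / SAMPLE_RATE) for n in range(256)]

for length in (32, 64):
    frames = stft_magnitude(signal, window_length=length, hop=length // 2)
    bin_spacing = SAMPLE_RATE / length  # frequency resolution in Hz
    print(length, len(frames), bin_spacing, peak_frequency_hz(frames, SAMPLE_RATE, length))
```

A longer window gives finer frequency bins but fewer, coarser time frames. A single fixed length cannot resolve both, which is the trade-off that becomes a problem when the shaft speed varies.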

TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning

Feb 18, 2024

Gangda Deng, Hongkuan Zhou, Hanqing Zeng, Yinglong Xia, Christopher Leung, Jianbo Li, Rajgopal Kannan, Viktor Prasanna

Abstract:Recently, Temporal Graph Neural Networks (TGNNs) have demonstrated state-of-the-art performance in various high-impact applications, including fraud detection and content recommendation. Despite the success of TGNNs, they are prone to the prevalent noise found in real-world dynamic graphs like time-deprecated links and skewed interaction distribution. The noise causes two critical issues that significantly compromise the accuracy of TGNNs: (1) models are supervised by inferior interactions, and (2) noisy input induces high variance in the aggregated messages. However, current TGNN denoising techniques do not consider the diverse and dynamic noise pattern of each node. In addition, they also suffer from the excessive mini-batch generation overheads caused by traversing more neighbors. We believe the remedy for fast and accurate TGNNs lies in temporal adaptive sampling. In this work, we propose TASER, the first adaptive sampling method for TGNNs optimized for accuracy, efficiency, and scalability. TASER adapts its mini-batch selection based on training dynamics and temporal neighbor selection based on the contextual, structural, and temporal properties of past interactions. To alleviate the bottleneck in mini-batch generation, TASER implements a pure GPU-based temporal neighbor finder and a dedicated GPU feature cache. We evaluate the performance of TASER using two state-of-the-art backbone TGNNs. On five popular datasets, TASER outperforms the corresponding baselines by an average of 2.3% in Mean Reciprocal Rank (MRR) while achieving an average of 5.1x speedup in training time.

* IPDPS 2024
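
For readers unfamiliar with the reported metric, the following sketch computes Mean Reciprocal Rank for link prediction. It assumes each query has one positive candidate scored against a set of negatives. The helper names and the scores are made up for illustration and do not come from the paper.

```python
def rank_of_positive(pos_score, neg_scores):
    """1-based rank of the positive item among its negatives."""
    return 1 + sum(1 for s in neg_scores if s > pos_score)


def mean_reciprocal_rank(queries):
    """queries: list of (positive_score, [negative_scores]) pairs."""
    if not queries:
        raise ValueError("need at least one query")
    ranks = [rank_of_positive(pos, negs) for pos, negs in queries]
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical model scores: the positives end up at ranks 1, 2 and 4.
queries = [
    (0.9, [0.1, 0.3]),
    (0.4, [0.7, 0.2]),
    (0.2, [0.5, 0.6, 0.9]),
]
mrr = mean_reciprocal_rank(queries)
print(round(mrr, 4))
```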

Object Detection for Caries or Pit and Fissure Sealing Requirement in Children's First Permanent Molars

Aug 31, 2023

Chenyao Jiang, Shiyao Zhai, Hengrui Song, Yuqing Ma, Yachen Fan, Yancheng Fang, Dongmei Yu, Canyang Zhang, Sanyang Han, Runming Wang(+3 more)

Abstract:Dental caries is one of the most common oral diseases that, if left untreated, can lead to a variety of oral problems. It mainly occurs inside the pits and fissures on the occlusal/buccal/palatal surfaces of molars, and children are a high-risk group for pit and fissure caries in permanent molars. Pit and fissure sealing is one of the most effective methods and is widely used in the prevention of pit and fissure caries. However, current detection of pits and fissures or caries depends primarily on the experience of dentists, which ordinary parents do not have, and children may miss remedial treatment without timely detection. To address this issue, we present a method to autodetect caries and pit and fissure sealing requirements using oral photos taken by smartphones. We use the YOLOv5 and YOLOX models and adopt a tiling strategy to reduce information loss during image pre-processing. The best result for the YOLOXs model with the tiling strategy is 72.3 mAP.5, while the best result without the tiling strategy is 71.2. The YOLOv5s6 model with/without tiling attains 70.9/67.9 mAP.5, respectively. We deploy the pre-trained network to mobile devices as a WeChat applet, allowing in-home detection by parents or children's guardians.
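
One common way to implement a tiling strategy is to cut the photo into fixed-size, overlapping crops, so that small structures are not lost when a large image is downscaled to the detector's input size. The sketch below only computes the crop boxes. The function name, tile size, and overlap are assumptions, and the abstract does not say which tiling scheme the authors used.

```python
def tile_boxes(width, height, tile, overlap):
    """Return (left, top, right, bottom) crop boxes that cover the whole image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]  # a single tile already covers this axis
        offsets = list(range(0, size - tile + 1, stride))
        if offsets[-1] != size - tile:
            offsets.append(size - tile)  # extra tile flush with the far edge
        return offsets

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


# Hypothetical 1000x600 photo, 640-pixel tiles, 128-pixel overlap.
boxes = tile_boxes(1000, 600, tile=640, overlap=128)
print(boxes)
```

Detections from each tile would then be shifted back by the tile's top-left offset and merged, for example with non-maximum suppression. That merging step is omitted here.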

Backscatter Sensors Communication for 6G Low-powered NOMA-enabled IoT Networks under Imperfect SIC

Sep 26, 2021

Manzoor Ahmed, Wali Ullah Khan, Asim Ihsan, Xingwang Li, Jianbo Li, Theodoros A. Tsiftsis

Abstract:The combination of power-domain non-orthogonal multiple access (NOMA) with backscatter sensor communication (BSC) is expected to connect large-scale Internet of Things (IoT) devices in the future sixth-generation (6G) era. In this paper, we introduce BSC in a multi-cell IoT network, where a source in each cell transmits a superimposed signal to its associated IoT devices using NOMA. The backscatter sensor tag (BST) also transmits data towards IoT devices by reflecting and modulating the superimposed signal of the source. A new optimization framework is provided that simultaneously optimizes the total power of each source, the power allocation coefficients of IoT devices, and the reflection coefficient of the BST under imperfect successive interference cancellation (SIC) decoding. The objective of this work is to maximize the total energy efficiency of the IoT network subject to the quality of service of each IoT device. The problem is first transformed using the Dinkelbach method and then decoupled into two subproblems. The Karush-Kuhn-Tucker conditions and the Lagrangian dual method are employed to obtain efficient solutions. In addition, we also present the conventional NOMA network without BSC as a benchmark framework. Simulation results unveil the advantage of our considered NOMA BSC networks over the conventional NOMA network.

* 11 pages, 6 figures
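
The Dinkelbach transformation mentioned in the abstract turns a ratio objective such as energy efficiency (rate divided by power) into a sequence of subtractive problems. The sketch below applies it to a toy single-link model and uses a grid search for the inner step, in place of the KKT and Lagrangian dual solutions the authors describe. The channel gain, circuit power, and power grid are hypothetical.

```python
import math


def dinkelbach(candidates, numerator, denominator, tol=1e-9, max_iter=100):
    """Maximize numerator(x) / denominator(x) over a finite candidate set."""
    lam = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        # Inner problem: maximize the subtractive form f(x) - lam * g(x).
        best = max(candidates, key=lambda x: numerator(x) - lam * denominator(x))
        gap = numerator(best) - lam * denominator(best)
        if gap <= tol:
            break  # lam has reached the optimal ratio
        lam = numerator(best) / denominator(best)
    return best, lam


CHANNEL_GAIN = 10.0   # hypothetical normalized channel gain
CIRCUIT_POWER = 0.5   # hypothetical fixed circuit power


def rate(p):
    return math.log2(1.0 + CHANNEL_GAIN * p)


def total_power(p):
    return p + CIRCUIT_POWER


powers = [i / 100 for i in range(1, 201)]  # candidate transmit powers
best_power, best_efficiency = dinkelbach(powers, rate, total_power)
print(best_power, round(best_efficiency, 4))
```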

Multimodal Gait Recognition for Neurodegenerative Diseases

Jan 07, 2021

Aite Zhao, Jianbo Li, Junyu Dong, Lin Qi, Qianni Zhang, Ning Li, Xin Wang, Huiyu Zhou

Abstract:In recent years, single-modality gait recognition has been extensively explored in the analysis of medical images or other sensory data, and it is recognised that each of the established approaches has different strengths and weaknesses. As an important motor symptom, gait disturbance is usually used for diagnosis and evaluation of diseases; moreover, the use of multi-modality analysis of the patient's walking pattern compensates for the one-sidedness of single-modality gait recognition methods that only learn gait changes in a single measurement dimension. The fusion of multiple measurement resources has demonstrated promising performance in the identification of gait patterns associated with individual diseases. In this paper, we propose a novel hybrid model to learn the gait differences between three neurodegenerative diseases, between patients with different severity levels of Parkinson's disease, and between healthy individuals and patients, by fusing and aggregating data from multiple sensors. A spatial feature extractor (SFE) is applied to generate representative features of images or signals. In order to capture temporal information from the two-modality data, a new correlative memory neural network (CorrMNN) architecture is designed for extracting temporal features. Afterwards, we embed a multi-switch discriminator to associate the observations with individual state estimations. Compared with several state-of-the-art techniques, our proposed framework shows more accurate classification results.

Associated Spatio-Temporal Capsule Network for Gait Recognition

Jan 07, 2021

Aite Zhao, Junyu Dong, Jianbo Li, Lin Qi, Huiyu Zhou

Abstract:It is a challenging task to identify a person based on her/his gait patterns. State-of-the-art approaches rely on the analysis of temporal or spatial characteristics of gait, and gait recognition is usually performed on single modality data (such as images, skeleton joint coordinates, or force signals). Evidence has shown that using multi-modality data is more conducive to gait research. Therefore, we here establish an automated learning system, with an associated spatio-temporal capsule network (ASTCapsNet) trained on multi-sensor datasets, to analyze multimodal information for gait recognition. Specifically, we first design a low-level feature extractor and a high-level feature extractor for spatio-temporal feature extraction of gait with a novel recurrent memory unit and a relationship layer. Subsequently, a Bayesian model is employed for the decision-making of class labels. Extensive experiments on several public datasets (normal and abnormal gait) validate the effectiveness of the proposed ASTCapsNet, compared against several state-of-the-art methods.

Collaborative City Digital Twin For Covid-19 Pandemic: A Federated Learning Solution

Nov 05, 2020

Junjie Pang, Jianbo Li, Zhenzhen Xie, Yan Huang, Zhipeng Cai

Abstract:In this work, we propose a collaborative city digital twin (DT) based on federated learning (FL), a novel paradigm that allows multiple city DTs to share their local strategy and status in a timely manner. In particular, an FL central server manages the local updates of multiple collaborators (city DTs) and provides a global model which is trained in multiple iterations at different city DT systems, until the model captures the correlations between various response plans and the infection trend. That is, a collaborative city DT paradigm based on FL techniques can obtain knowledge and patterns from multiple DTs, and eventually establish a 'global view' for city crisis management. Meanwhile, it also helps to improve each city digital twin itself by consolidating other DTs' respective data without violating privacy rules. To validate the proposed solution, we take the COVID-19 pandemic as a case study. The experimental results on a real dataset with various response plans validate our proposed solution and demonstrate its superior performance.

* 8 pages
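
The server-side step described above can be sketched as a sample-weighted average of the collaborators' model parameters, in the style of federated averaging. The abstract does not state which aggregation rule the authors use, so this is an assumption. The function name, parameter vectors, and sample counts are illustrative only.

```python
def federated_average(local_weights, sample_counts):
    """Weighted mean of parameter vectors, weighted by local sample counts."""
    if len(local_weights) != len(sample_counts) or not local_weights:
        raise ValueError("need one sample count per collaborator")
    total = sum(sample_counts)
    dim = len(local_weights[0])
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]


# Two hypothetical city DTs send parameters, never their raw data.
city_a = [1.0, 2.0]   # trained on 1 local sample (toy number)
city_b = [3.0, 4.0]   # trained on 3 local samples (toy number)
global_model = federated_average([city_a, city_b], sample_counts=[1, 3])
print(global_model)
```

In each round the server would send `global_model` back to every city DT for further local training. Only model parameters cross the network in this scheme.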

Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization

Jul 20, 2020

Haibao Yu, Qi Han, Jianbo Li, Jianping Shi, Guangliang Cheng, Bin Fan

Abstract:Emerging hardware can support inference of mixed precision CNN models that assign different bitwidths to different layers. Learning to find an optimal mixed precision model that preserves accuracy and satisfies specific constraints on model size and computation is extremely challenging due to the difficulty of training a mixed precision model and the huge space of all possible bit quantizations. In this paper, we propose a novel soft Barrier Penalty based NAS (BP-NAS) for mixed precision quantization, which ensures all the searched models are inside the valid domain defined by the complexity constraint, and thus can return an optimal model under the given constraint by conducting the search only once. The proposed soft Barrier Penalty is differentiable and imposes very large losses on models outside the valid domain while imposing almost no penalty on models inside it, thus constraining the search to the feasible domain. In addition, a differentiable Prob-1 regularizer is proposed to ensure learning with NAS is reasonable. A distribution reshaping training strategy is also used to make training more stable. BP-NAS sets a new state of the art on both classification (Cifar-10, ImageNet) and detection (COCO), surpassing all the efficient mixed precision methods designed manually and automatically. In particular, BP-NAS achieves higher mAP (up to 2.7% mAP improvement) together with lower bit computation cost compared with the existing best mixed precision model on COCO detection.

* ECCV2020
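
To show what a complexity constraint on a mixed precision model can look like, the sketch below computes model size and a bit-operation count for a given per-layer bitwidth assignment. It uses the common convention that bit operations equal multiply-accumulates times weight bits times activation bits, which may differ from the paper's exact cost definition. The layer sizes and the budget are hypothetical, and the barrier penalty itself is not implemented.

```python
def model_size_bytes(layers):
    """Total weight storage when each layer uses its own weight bitwidth."""
    return sum(layer["params"] * layer["w_bits"] for layer in layers) / 8


def bit_operations(layers):
    """Assumed cost convention: MACs x weight bits x activation bits."""
    return sum(layer["macs"] * layer["w_bits"] * layer["a_bits"] for layer in layers)


# Hypothetical two-layer network with different bitwidths per layer.
layers = [
    {"params": 1000, "macs": 50_000, "w_bits": 8, "a_bits": 8},
    {"params": 2000, "macs": 20_000, "w_bits": 4, "a_bits": 4},
]

SIZE_BUDGET_BYTES = 2048  # hypothetical constraint
size = model_size_bytes(layers)
print(size, bit_operations(layers), size <= SIZE_BUDGET_BYTES)
```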
