Changqing Li

MASteer: Multi-Agent Adaptive Steer Strategy for End-to-End LLM Trustworthiness Repair

Aug 09, 2025

Changqing Li, Tianlin Li, Xiaohan Zhang, Aishan Liu, Li Pan

Abstract: Large Language Models (LLMs) face persistent and evolving trustworthiness issues, motivating developers to seek automated and flexible repair methods that enable convenient deployment across diverse scenarios. Existing repair methods like supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) are costly and slow, while prompt engineering lacks robustness and scalability. Representation engineering, which steers model behavior by injecting targeted concept vectors during inference, offers a lightweight, training-free alternative. However, current approaches depend on manually crafted samples and fixed steering strategies, limiting automation and adaptability. To overcome these challenges, we propose MASteer, the first end-to-end framework for trustworthiness repair in LLMs based on representation engineering. MASteer integrates two core components: AutoTester, a multi-agent system that generates diverse, high-quality steer samples tailored to developer needs; and AutoRepairer, which constructs adaptive steering strategies with anchor vectors for automated, context-aware strategy selection during inference. Experiments on standard and customized trustworthiness tasks show MASteer consistently outperforms baselines, improving metrics by 15.36% on LLaMA-3.1-8B-Chat and 4.21% on Qwen-3-8B-Chat, while maintaining general model capabilities. MASteer demonstrates strong robustness, generalization, and practical value for scalable, efficient trustworthiness repair.
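To make the two ideas in this abstract concrete (injecting a concept vector into a hidden state, and using anchor vectors to pick a steering strategy for the current context), here is a minimal toy sketch. It is not MASteer's implementation: the function names, the cosine-similarity selection rule, the additive update, and all numbers are assumptions chosen for illustration only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def steer(hidden, concept, strength):
    """Add a scaled concept vector to a single hidden-state vector."""
    return [h + strength * c for h, c in zip(hidden, concept)]

def select_strategy(hidden, strategies):
    """Assumed rule: pick the strategy whose anchor is most similar to the hidden state."""
    return max(strategies, key=lambda s: cosine(hidden, s["anchor"]))

# Hypothetical strategies; real concept and anchor vectors would come from a model.
strategies = [
    {"name": "truthfulness", "anchor": [1.0, 0.0, 0.0],
     "concept": [0.0, 1.0, 0.0], "strength": 2.0},
    {"name": "safety", "anchor": [0.0, 0.0, 1.0],
     "concept": [0.0, -1.0, 0.0], "strength": 0.5},
]

hidden = [0.9, 0.1, 0.2]  # toy hidden state for the current input
chosen = select_strategy(hidden, strategies)
steered = steer(hidden, chosen["concept"], chosen["strength"])
print(chosen["name"], steered)
```

In this toy setup the hidden state lies closest to the first anchor, so the first strategy's concept vector is the one added. A real system would apply the update inside selected transformer layers rather than to a standalone list.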

Inference Performance Optimization for Large Language Models on CPUs

Jul 10, 2024

Pujiang He, Shan Zhou, Wenhuan Huang, Changqing Li, Duyi Wang, Bin Guo, Chen Meng, Sheng Gui, Weifei Yu, Yi Xie

Abstract: Large language models (LLMs) have shown exceptional performance and vast potential across diverse tasks. However, the deployment of LLMs with high performance in low-resource environments has garnered significant attention in the industry. When GPU hardware resources are limited, we can explore alternative options on CPUs. To mitigate the financial burden and alleviate constraints imposed by hardware resources, optimizing inference performance is necessary. In this paper, we introduce an easily deployable inference performance optimization solution aimed at accelerating LLMs on CPUs. In this solution, we implement an effective way to reduce the KV cache size while ensuring precision. We propose a distributed inference optimization approach and implement it based on the oneAPI Collective Communications Library. Furthermore, we propose optimization approaches for LLMs on CPUs, and conduct tailored optimizations for the most commonly used models. The code is open-sourced at https://github.com/intel/xFasterTransformer.
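The abstract does not say how the KV cache is shrunk. One common way to reduce cache size while bounding the precision loss is to store each cached vector as 8-bit integers plus a single floating-point scale. The sketch below illustrates that general idea only. It is an assumption, not a description of the xFasterTransformer code, and the function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one scale per vector."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [x * scale for x in q]

# Toy "key" vector for one token; real caches hold one such vector per head and token.
key = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(key)
restored = dequantize_int8(q, scale)

# Rounding to the nearest integer keeps the per-element error within half a step.
max_error = max(abs(a - b) for a, b in zip(key, restored))
print(q, scale, max_error)
```

Storing one byte per element instead of a 16-bit or 32-bit float cuts the cache to roughly half or a quarter of its size, at the cost of an error of at most half a quantization step per element.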

* 5 pages, 6 figures, ICML 2024 Workshop on Foundation Models in the Wild

Development of Focused X-ray Luminescence Compute Tomography Imaging

Jun 11, 2024

Yile Fang, Yibing Zhang, Changqing Li

Abstract: X-ray luminescence is produced when contrast agents absorb energy from X-ray photons and release a portion of that energy by emitting photons in the visible and near-infrared range. X-ray luminescence computed tomography (XLCT) was introduced in the past decade as a hybrid molecular imaging modality combining the merits of both X-ray imaging (high spatial resolution) and optical imaging (high sensitivity to tracer nanophosphors).
