Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tianrui Qin

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

Feb 26, 2026

Qianben Chen, Tianrui Qin, King Zhu, Qiexiang Wang, Chengjun Yu, Shu Xu, Jiaqi Wu, Jiayu Zhang, Xinpeng Liu, Xin Gui(+14 more)

Abstract:Recent deep research agents primarily improve performance by scaling reasoning depth, but this leads to high inference cost and latency in search-intensive scenarios. Moreover, generalization across heterogeneous research settings remains challenging. In this work, we propose \emph{Search More, Think Less} (SMTL), a framework for long-horizon agentic search that targets both efficiency and generalization. SMTL replaces sequential reasoning with parallel evidence acquisition, enabling efficient context management under constrained context budgets. To support generalization across task types, we further introduce a unified data synthesis pipeline that constructs search tasks spanning both deterministic question answering and open-ended research scenarios with task appropriate evaluation metrics. We train an end-to-end agent using supervised fine-tuning and reinforcement learning, achieving strong and often state of the art performance across benchmarks including BrowseComp (48.6\%), GAIA (75.7\%), Xbench (82.0\%), and DeepResearch Bench (45.9\%). Compared to Mirothinker-v1.0, SMTL with maximum 100 interaction steps reduces the average number of reasoning steps on BrowseComp by 70.7\%, while improving accuracy.

* 12 pages, 5 figures

Via

Access Paper or Ask Questions

PersonaDual: Balancing Personalization and Objectivity via Adaptive Reasoning

Jan 13, 2026

Xiaoyou Liu, Xinyi Mou, Shengbin Yue, Liang Wang, Yuqing Wang, Qiexiang Wang, Tianrui Qin, Wangchunshu Zhou, Zhongyu Wei

Abstract:As users increasingly expect LLMs to align with their preferences, personalized information becomes valuable. However, personalized information can be a double-edged sword: it can improve interaction but may compromise objectivity and factual correctness, especially when it is misaligned with the question. To alleviate this problem, we propose PersonaDual, a framework that supports both general-purpose objective reasoning and personalized reasoning in a single model, and adaptively switches modes based on context. PersonaDual is first trained with SFT to learn two reasoning patterns, and then further optimized via reinforcement learning with our proposed DualGRPO to improve mode selection. Experiments on objective and personalized benchmarks show that PersonaDual preserves the benefits of personalization while reducing interference, achieving near interference-free performance and better leveraging helpful personalized signals to improve objective problem-solving.

Via

Access Paper or Ask Questions

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Jul 08, 2025

Xiangru Tang, Tianrui Qin, Tianhao Peng, Ziyang Zhou, Daniel Shao, Tingting Du, Xinming Wei, Peng Xia, Fang Wu, He Zhu(+8 more)

Abstract:As language agents tackle increasingly complex tasks, they struggle with effective error correction and experience reuse across domains. We introduce Agent KB, a hierarchical experience framework that enables complex agentic problem solving via a novel Reason-Retrieve-Refine pipeline. Agent KB addresses a core limitation: agents traditionally cannot learn from each other's experiences. By capturing both high-level strategies and detailed execution logs, Agent KB creates a shared knowledge base that enables cross-agent knowledge transfer. Evaluated on the GAIA benchmark, Agent KB improves success rates by up to 16.28 percentage points. On the most challenging tasks, Claude-3 improves from 38.46% to 57.69%, while GPT-4 improves from 53.49% to 73.26% on intermediate tasks. On SWE-bench code repair, Agent KB enables Claude-3 to improve from 41.33% to 53.33%. Our results suggest that Agent KB provides a modular, framework-agnostic infrastructure for enabling agents to learn from past experiences and generalize successful strategies to new tasks.

Via

Access Paper or Ask Questions

TaskCraft: Automated Generation of Agentic Tasks

Jun 11, 2025

Dingfeng Shi, Jingyi Cao, Qianben Chen, Weichen Sun, Weizhen Li, Hongxuan Lu, Fangchen Dong, Tianrui Qin, King Zhu, Minghao Yang(+7 more)

Abstract:Agentic tasks, which require multi-step problem solving with autonomy, tool use, and adaptive reasoning, are becoming increasingly central to the advancement of NLP and AI. However, existing instruction data lacks tool interaction, and current agentic benchmarks rely on costly human annotation, limiting their scalability. We introduce \textsc{TaskCraft}, an automated workflow for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories. TaskCraft expands atomic tasks using depth-based and width-based extensions to create structurally and hierarchically complex challenges. Empirical results show that these tasks improve prompt optimization in the generation workflow and enhance supervised fine-tuning of agentic foundation models. We present a large-scale synthetic dataset of approximately 36,000 tasks with varying difficulty to support future research on agent tuning and evaluation.

Via

Access Paper or Ask Questions

APBench: A Unified Benchmark for Availability Poisoning Attacks and Defenses

Aug 07, 2023

Tianrui Qin, Xitong Gao, Juanjuan Zhao, Kejiang Ye, Cheng-Zhong Xu

Figure 1 for APBench: A Unified Benchmark for Availability Poisoning Attacks and Defenses

Figure 2 for APBench: A Unified Benchmark for Availability Poisoning Attacks and Defenses

Figure 3 for APBench: A Unified Benchmark for Availability Poisoning Attacks and Defenses

Figure 4 for APBench: A Unified Benchmark for Availability Poisoning Attacks and Defenses

Abstract:The efficacy of availability poisoning, a method of poisoning data by injecting imperceptible perturbations to prevent its use in model training, has been a hot subject of investigation. Previous research suggested that it was difficult to effectively counteract such poisoning attacks. However, the introduction of various defense methods has challenged this notion. Due to the rapid progress in this field, the performance of different novel methods cannot be accurately validated due to variations in experimental setups. To further evaluate the attack and defense capabilities of these poisoning methods, we have developed a benchmark -- APBench for assessing the efficacy of adversarial poisoning. APBench consists of 9 state-of-the-art availability poisoning attacks, 8 defense algorithms, and 4 conventional data augmentation techniques. We also have set up experiments with varying different poisoning ratios, and evaluated the attacks on multiple datasets and their transferability across model architectures. We further conducted a comprehensive evaluation of 2 additional attacks specifically targeting unsupervised models. Our results reveal the glaring inadequacy of existing attacks in safeguarding individual privacy. APBench is open source and available to the deep learning community: https://github.com/lafeat/apbench.

Via

Access Paper or Ask Questions

Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

Mar 27, 2023

Tianrui Qin, Xitong Gao, Juanjuan Zhao, Kejiang Ye, Cheng-Zhong Xu

Figure 1 for Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

Figure 2 for Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

Figure 3 for Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

Figure 4 for Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks

Abstract:Unlearnable example attacks are data poisoning techniques that can be used to safeguard public data against unauthorized use for training deep learning models. These methods add stealthy perturbations to the original image, thereby making it difficult for deep learning models to learn from these training data effectively. Current research suggests that adversarial training can, to a certain degree, mitigate the impact of unlearnable example attacks, while common data augmentation methods are not effective against such poisons. Adversarial training, however, demands considerable computational resources and can result in non-trivial accuracy loss. In this paper, we introduce the UEraser method, which outperforms current defenses against different types of state-of-the-art unlearnable example attacks through a combination of effective data augmentation policies and loss-maximizing adversarial augmentations. In stark contrast to the current SOTA adversarial training methods, UEraser uses adversarial augmentations, which extends beyond the confines of $ \ell_p $ perturbation budget assumed by current unlearning attacks and defenses. It also helps to improve the model's generalization ability, thus protecting against accuracy loss. UEraser wipes out the unlearning effect with error-maximizing data augmentations, thus restoring trained model accuracies. Interestingly, UEraser-Lite, a fast variant without adversarial augmentations, is also highly effective in preserving clean accuracies. On challenging unlearnable CIFAR-10, CIFAR-100, SVHN, and ImageNet-subset datasets produced with various attacks, it achieves results that are comparable to those obtained during clean training. We also demonstrate its efficacy against possible adaptive attacks. Our code is open source and available to the deep learning community: https://github.com/lafeat/ueraser.

* UEraser introduces adversarial augmentations to suppress unlearnable example attacks and outperforms current defenses

Via

Access Paper or Ask Questions

Flareon: Stealthy any2any Backdoor Injection via Poisoned Augmentation

Dec 20, 2022

Tianrui Qin, Xianghuan He, Xitong Gao, Yiren Zhao, Kejiang Ye, Cheng-Zhong Xu

Abstract:Open software supply chain attacks, once successful, can exact heavy costs in mission-critical applications. As open-source ecosystems for deep learning flourish and become increasingly universal, they present attackers previously unexplored avenues to code-inject malicious backdoors in deep neural network models. This paper proposes Flareon, a small, stealthy, seemingly harmless code modification that specifically targets the data augmentation pipeline with motion-based triggers. Flareon neither alters ground-truth labels, nor modifies the training loss objective, nor does it assume prior knowledge of the victim model architecture, training data, and training hyperparameters. Yet, it has a surprisingly large ramification on training -- models trained under Flareon learn powerful target-conditional (or "any2any") backdoors. The resulting models can exhibit high attack success rates for any target choices and better clean accuracies than backdoor attacks that not only seize greater control, but also assume more restrictive attack capabilities. We also demonstrate the effectiveness of Flareon against recent defenses. Flareon is fully open-source and available online to the deep learning community: https://github.com/lafeat/flareon.

Via

Access Paper or Ask Questions

Over-the-Air Computation via Broadband Channels

Jun 04, 2021

Tianrui Qin, Wanchun Liu, Branka Vucetic, Yonghui Li

Figure 1 for Over-the-Air Computation via Broadband Channels

Figure 2 for Over-the-Air Computation via Broadband Channels

Figure 3 for Over-the-Air Computation via Broadband Channels

Figure 4 for Over-the-Air Computation via Broadband Channels

Abstract:Over-the-air computation (AirComp) has been recognized as a low-latency solution for wireless sensor data fusion, where multiple sensors send their measurement signals to a receiver simultaneously for computation. Most existing work only considered performing AirComp over a single frequency channel. However, for a sensor network with a massive number of nodes, a single frequency channel may not be sufficient to accommodate the large number of sensors, and the AirComp performance will be very limited. So it is highly desirable to have more frequency channels for large-scale AirComp systems to benefit from multi-channel diversity. In this letter, we propose an $M$-frequency AirComp system, where each sensor selects a subset of the $M$ frequencies and broadcasts its signal over these channels under a certain power constraint. We derive the optimal sensors' transmission and receiver's signal processing methods separately, and develop an algorithm for joint design to achieve the best AirComp performance. Numerical results show that increasing one frequency channel can improve the AirComp performance by threefold compared to the single-frequency case.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Via

Access Paper or Ask Questions