Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Haibo Shi

A Dual-Task Synergy-Driven Generalization Framework for Pancreatic Cancer Segmentation in CT Scans

May 03, 2025

Jun Li, Yijue Zhang, Haibo Shi, Minhong Li, Qiwei Li, Xiaohua Qian

Abstract:Pancreatic cancer, characterized by its notable prevalence and mortality rates, demands accurate lesion delineation for effective diagnosis and therapeutic interventions. The generalizability of extant methods is frequently compromised due to the pronounced variability in imaging and the heterogeneous characteristics of pancreatic lesions, which may mimic normal tissues and exhibit significant inter-patient variability. Thus, we propose a generalization framework that synergizes pixel-level classification and regression tasks, to accurately delineate lesions and improve model stability. This framework not only seeks to align segmentation contours with actual lesions but also uses regression to elucidate spatial relationships between diseased and normal tissues, thereby improving tumor localization and morphological characterization. Enhanced by the reciprocal transformation of task outputs, our approach integrates additional regression supervision within the segmentation context, bolstering the model's generalization ability from a dual-task perspective. Besides, dual self-supervised learning in feature spaces and output spaces augments the model's representational capability and stability across different imaging views. Experiments on 594 samples composed of three datasets with significant imaging differences demonstrate that our generalized pancreas segmentation results comparable to mainstream in-domain validation performance (Dice: 84.07%). More importantly, it successfully improves the results of the highly challenging cross-lesion generalized pancreatic cancer segmentation task by 9.51%. Thus, our model constitutes a resilient and efficient foundational technological support for pancreatic disease management and wider medical applications. The codes will be released at https://github.com/SJTUBME-QianLab/Dual-Task-Seg.

* accept by IEEE Transactions on Medical Imaging (TMI) 2025

Via

Access Paper or Ask Questions

ATM: Adversarial Tuning Multi-agent System Makes a Robust Retrieval-Augmented Generator

May 28, 2024

Junda Zhu, Lingyong Yan, Haibo Shi, Dawei Yin, Lei Sha

Abstract:Large language model (LLM) has proven to benefit a lot from retrieval augmentation in alleviating hallucinations confronted with knowledge-intensive questions. Retrieval-augmented generation (RAG) adopts IR-based techniques utilizing semantic-relevant documents as the generator's input context and realizes external knowledge injection. However, on today's Internet which is flooded with content generated by LLMs, there are too many "related yet useless" documents or even fake knowledge fabricated by LLMs, which will introduce extra noise to the generator and distract it from giving correct results. To this end, we regard the training of the RAG generator model as a multi-agent adversarial-defensive system, guiding the generator to have a better taste of whether a specific document helps answer the question through the Adversarial Tuning in a Multi-agent (ATM) system to strengthen the generator's robustness in an RAG pipeline. After rounds of multi-agent iterative tuning, we find that the ATM Generator can eventually discriminate useful documents amongst LLM fabrications and achieve better performance than strong baselines.

* 16 pages

Via

Access Paper or Ask Questions

Chain of Tools: Large Language Model is an Automatic Multi-tool Learner

May 26, 2024

Zhengliang Shi, Shen Gao, Xiuyi Chen, Yue Feng, Lingyong Yan, Haibo Shi, Dawei Yin, Zhumin Chen, Suzan Verberne, Zhaochun Ren

Abstract:Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extend their utility, empowering them to solve practical tasks. Existing work typically empowers LLMs as tool users with a manually designed workflow, where the LLM plans a series of tools in a step-by-step manner, and sequentially executes each tool to obtain intermediate results until deriving the final answer. However, they suffer from two challenges in realistic scenarios: (1) The handcrafted control flow is often ad-hoc and constraints the LLM to local planning; (2) The LLM is instructed to use only manually demonstrated tools or well-trained Python functions, which limits its generalization to new tools. In this work, we first propose Automatic Tool Chain (ATC), a framework that enables the LLM to act as a multi-tool user, which directly utilizes a chain of tools through programming. To scale up the scope of the tools, we next propose a black-box probing method. This further empowers the LLM as a tool learner that can actively discover and document tool usages, teaching themselves to properly master new tools. For a comprehensive evaluation, we build a challenging benchmark named ToolFlow, which diverges from previous benchmarks by its long-term planning scenarios and complex toolset. Experiments on both existing datasets and ToolFlow illustrate the superiority of our framework. Analysis on different settings also validates the effectiveness and the utility of our black-box probing algorithm.

* Work in progress

Via

Access Paper or Ask Questions

GOVERN: Gradient Orientation Vote Ensemble for Multi-Teacher Reinforced Distillation

May 06, 2024

Wenjie Zhou, Zhenxin Ding, Xiaodong Zhang, Haibo Shi, Junfeng Wang, Dawei Yin

Abstract:Pre-trained language models have become an integral component of question-answering systems, achieving remarkable performance. For practical deployment, it is critical to carry out knowledge distillation to preserve high performance under computational constraints. In this paper, we address a key question: given the importance of unsupervised distillation for student performance, how does one effectively ensemble knowledge from multiple teachers at this stage without the guidance of ground-truth labels? We propose a novel algorithm, GOVERN, to tackle this issue. GOVERN has demonstrated significant improvements in both offline and online experiments. The proposed algorithm has been successfully deployed in a real-world commercial question-answering system.

Via

Access Paper or Ask Questions

The Real, the Better: Aligning Large Language Models with Online Human Behaviors

May 01, 2024

Guanying Jiang, Lingyong Yan, Haibo Shi, Dawei Yin

Abstract:Large language model alignment is widely used and studied to avoid LLM producing unhelpful and harmful responses. However, the lengthy training process and predefined preference bias hinder adaptation to online diverse human preferences. To this end, this paper proposes an alignment framework, called Reinforcement Learning with Human Behavior (RLHB), to align LLMs by directly leveraging real online human behaviors. By taking the generative adversarial framework, the generator is trained to respond following expected human behavior; while the discriminator tries to verify whether the triplets of query, response, and human behavior come from real online environments. Behavior modeling in natural-language form and the multi-model joint training mechanism enable an active and sustainable online alignment. Experimental results confirm the effectiveness of our proposed methods by both human and automatic evaluations.

* 11 pages, 6 figures

Via

Access Paper or Ask Questions

Learning to Use Tools via Cooperative and Interactive Agents

Mar 05, 2024

Zhengliang Shi, Shen Gao, Xiuyi Chen, Lingyong Yan, Haibo Shi, Dawei Yin, Zhumin Chen, Pengjie Ren, Suzan Verberne, Zhaochun Ren

Figure 1 for Learning to Use Tools via Cooperative and Interactive Agents

Figure 2 for Learning to Use Tools via Cooperative and Interactive Agents

Figure 3 for Learning to Use Tools via Cooperative and Interactive Agents

Figure 4 for Learning to Use Tools via Cooperative and Interactive Agents

Abstract:Tool learning empowers large language models (LLMs) as agents to use external tools to extend their capability. Existing methods employ one single LLM-based agent to iteratively select and execute tools, thereafter incorporating the result into the next action prediction. However, they still suffer from potential performance degradation when addressing complex tasks due to: (1) the limitation of the inherent capability of a single LLM to perform diverse actions, and (2) the struggle to adaptively correct mistakes when the task fails. To mitigate these problems, we propose the ConAgents, a Cooperative and interactive Agents framework, which modularizes the workflow of tool learning into Grounding, Execution, and Observing agents. We also introduce an iterative calibration (IterCali) method, enabling the agents to adapt themselves based on the feedback from the tool environment. Experiments conducted on three datasets demonstrate the superiority of our ConAgents (e.g., 6 point improvement over the SOTA baseline). We further provide fine-granularity analysis for the efficiency and consistency of our framework.

* 20 pages

Via

Access Paper or Ask Questions

KnowTuning: Knowledge-aware Fine-tuning for Large Language Models

Feb 17, 2024

Yougang Lyu, Lingyong Yan, Shuaiqiang Wang, Haibo Shi, Dawei Yin, Pengjie Ren, Zhumin Chen, Maarten de Rijke, Zhaochun Ren

Abstract:Despite their success at many natural language processing (NLP) tasks, large language models (LLMs) still struggle to effectively leverage knowledge for knowledge-intensive tasks, manifesting limitations such as generating incomplete, non-factual, or illogical answers. These limitations stem from inadequate knowledge awareness of LLMs during vanilla fine-tuning. To address these problems, we propose a knowledge-aware fine-tuning (KnowTuning) method to explicitly and implicitly improve the knowledge awareness of LLMs. We devise an explicit knowledge-aware generation stage to train LLMs to explicitly identify knowledge triples in answers. We also propose an implicit knowledge-aware comparison stage to train LLMs to implicitly distinguish between reliable and unreliable knowledge, in three aspects: completeness, factuality, and logicality. Extensive experiments on both generic and medical question answering (QA) datasets confirm the effectiveness of KnowTuning, through automatic and human evaluations, across various sizes of LLMs. Finally, we demonstrate that the improvements of KnowTuning generalize to unseen QA datasets.

Via

Access Paper or Ask Questions

LOIS: Looking Out of Instance Semantics for Visual Question Answering

Jul 26, 2023

Siyu Zhang, Yeming Chen, Yaoru Sun, Fang Wang, Haibo Shi, Haoran Wang

Abstract:Visual question answering (VQA) has been intensively studied as a multimodal task that requires effort in bridging vision and language to infer answers correctly. Recent attempts have developed various attention-based modules for solving VQA tasks. However, the performance of model inference is largely bottlenecked by visual processing for semantics understanding. Most existing detection methods rely on bounding boxes, remaining a serious challenge for VQA models to understand the causal nexus of object semantics in images and correctly infer contextual information. To this end, we propose a finer model framework without bounding boxes in this work, termed Looking Out of Instance Semantics (LOIS) to tackle this important issue. LOIS enables more fine-grained feature descriptions to produce visual facts. Furthermore, to overcome the label ambiguity caused by instance masks, two types of relation attention modules: 1) intra-modality and 2) inter-modality, are devised to infer the correct answers from the different multi-view features. Specifically, we implement a mutual relation attention module to model sophisticated and deeper visual semantic relations between instance objects and background information. In addition, our proposed attention model can further analyze salient image regions by focusing on important word-related questions. Experimental results on four benchmark VQA datasets prove that our proposed method has favorable performance in improving visual reasoning capability.

Via

Access Paper or Ask Questions

Research on Limited Buffer Scheduling Problems in Flexible Flow Shops with Setup Times

Dec 07, 2018

Zhonghua Han, Quan Zhang, Haibo Shi, Yuanwei Qi, Liangliang Sun

Figure 1 for Research on Limited Buffer Scheduling Problems in Flexible Flow Shops with Setup Times

Figure 2 for Research on Limited Buffer Scheduling Problems in Flexible Flow Shops with Setup Times

Figure 3 for Research on Limited Buffer Scheduling Problems in Flexible Flow Shops with Setup Times

Figure 4 for Research on Limited Buffer Scheduling Problems in Flexible Flow Shops with Setup Times

Abstract:In order to solve the limited buffer scheduling problems in flexible flow shops with setup times, this paper proposes an improved whale optimization algorithm (IWOA) as a global optimization algorithm. Firstly, this paper presents a mathematic programming model for limited buffer in flexible flow shops with setup times, and applies the IWOA algorithm as the global optimization algorithm. Based on the whale optimization algorithm (WOA), the improved algorithm uses Levy flight, opposition-based learning strategy and simulated annealing to expand the search range, enhance the ability for jumping out of local extremum, and improve the continuous evolution of the algorithm. To verify the improvement of the proposed algorithm on the optimization ability of the standard WOA algorithm, the IWOA algorithm is tested by verification examples of small-scale and large-scale flexible flow shop scheduling problems, and the imperialist competitive algorithm (ICA), bat algorithm (BA), and whale optimization algorithm (WOA) are used for comparision. Based on the instance data of bus manufacturer, simulation tests are made on the four algorithms under variouis of practical evalucation scenarios. The simulation results show that the IWOA algorithm can better solve this type of limited buffer scheduling problem in flexible flow shops with setup times compared with the state of the art algorithms.

* Accepted for publication by International Journal of Modelling, Identification and Control (IJMIC)

Via

Access Paper or Ask Questions