Keming Ye

Optimize Incompatible Parameters through Compatibility-aware Knowledge Integration

Jan 10, 2025

Zheqi Lv, Keming Ye, Zishu Wei, Qi Tian, Shengyu Zhang, Wenqiao Zhang, Wenjie Wang, Kun Kuang, Tat-Seng Chua, Fei Wu

Abstract: Deep neural networks have become foundational to advances in multiple domains, including recommendation systems and natural language processing. Despite their successes, these models often contain incompatible parameters that are underutilized or detrimental to model performance, particularly when the model faces specific, varying data distributions. Existing research excels at removing such parameters or at merging the outputs of multiple different pretrained models. However, the former focuses on efficiency rather than performance, while the latter requires several times more computing and storage resources to support inference. In this paper, we aim to explicitly improve these incompatible parameters by leveraging the complementary strengths of different models, thereby directly enhancing the models without any additional parameters. Specifically, we propose Compatibility-aware Knowledge Integration (CKI), which consists of two components: Parameter Compatibility Assessment, which evaluates the knowledge content of multiple models, and Parameter Splicing, which integrates that knowledge into one model. The integrated model can be used directly for inference or for further fine-tuning. We conduct extensive experiments on various datasets for recommendation and language tasks, and the results show that CKI can effectively optimize incompatible parameters under multiple tasks and settings, breaking through the training limit of the original model without increasing the inference cost.

* Published at AAAI'25: The Annual AAAI Conference on Artificial Intelligence
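
The two stages named in the abstract above (assessing parameters, then splicing them into one model of unchanged size) can be illustrated with a minimal sketch. The abstract does not specify how compatibility is scored or how parameters are combined, so everything here is an assumption made for illustration: `magnitude_score` is a hypothetical stand-in for the assessment step, and the "soft" (score-weighted average) and "hard" (pick the higher-scoring value) modes of `splice` are illustrative choices, not the paper's method.

```python
from typing import Callable, List

Params = List[float]


def magnitude_score(params: Params) -> List[float]:
    """Hypothetical compatibility score: absolute value of each parameter.

    This is only a placeholder; the abstract does not say how
    compatibility is actually assessed.
    """
    return [abs(w) for w in params]


def splice(
    params_a: Params,
    params_b: Params,
    score_fn: Callable[[Params], List[float]] = magnitude_score,
    soft: bool = True,
) -> Params:
    """Combine two same-shaped parameter lists into a single list.

    soft=True  -> score-weighted average of the two values.
    soft=False -> keep the value with the higher score.
    The result has the same length as each input, so the merged
    model has no additional parameters.
    """
    if len(params_a) != len(params_b):
        raise ValueError("both models must have the same number of parameters")

    scores_a = score_fn(params_a)
    scores_b = score_fn(params_b)

    merged: Params = []
    for wa, wb, sa, sb in zip(params_a, params_b, scores_a, scores_b):
        if soft:
            total = sa + sb
            # Fall back to a plain average when both scores are zero.
            alpha = 0.5 if total == 0 else sa / total
            merged.append(alpha * wa + (1 - alpha) * wb)
        else:
            merged.append(wa if sa >= sb else wb)
    return merged
```

For example, `splice([1.0, 0.0], [0.0, 2.0], soft=False)` keeps the first value from the first model and the second value from the second model. In a real setting the score function would have to come from the paper's assessment procedure rather than from raw magnitudes.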

StructGPT: A General Framework for Large Language Model to Reason over Structured Data

May 16, 2023

Jinhao Jiang, Kun Zhou, Zican Dong, Keming Ye, Wayne Xin Zhao, Ji-Rong Wen

Abstract: In this paper, we study how to improve the zero-shot reasoning ability of large language models (LLMs) over structured data in a unified way. Inspired by the study on tool augmentation for LLMs, we develop an Iterative Reading-then-Reasoning (IRR) approach for solving question answering tasks based on structured data, called StructGPT. In our approach, we construct specialized functions to collect relevant evidence from structured data (i.e., reading), and let LLMs concentrate on the reasoning task based on the collected information (i.e., reasoning). Specifically, we propose an invoking-linearization-generation procedure to support LLMs in reasoning on the structured data with the help of the external interfaces. By iterating this procedure with the provided interfaces, our approach can gradually approach the target answer to a given query. Extensive experiments conducted on three types of structured data demonstrate the effectiveness of our approach, which can significantly boost the performance of ChatGPT and achieve comparable performance against the full-data supervised-tuning baselines. Our code and data are publicly available at https://github.com/RUCAIBox/StructGPT.

* 13 pages, work in progress
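
The invoking-linearization-generation loop described in the abstract above can be sketched on a toy knowledge graph. This is a minimal illustration under stated assumptions, not the StructGPT implementation: the function names (`invoke_neighbors`, `linearize`, `iterative_read_then_reason`), the prompt wording, the `ANSWER:` reply convention, and the rule-based `toy_llm` that stands in for a real language model are all hypothetical.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def invoke_neighbors(kg: List[Triple], entity: str) -> List[Triple]:
    """Invoking (reading): collect the triples whose head is `entity`."""
    return [t for t in kg if t[0] == entity]


def linearize(triples: List[Triple]) -> str:
    """Linearization: flatten structured evidence into plain text."""
    return "; ".join(f"({h}, {r}, {t})" for h, r, t in triples)


def iterative_read_then_reason(
    kg: List[Triple],
    start: str,
    question: str,
    llm: Callable[[str], str],
    max_hops: int = 3,
) -> str:
    """Alternate reading and reasoning until the model gives an answer.

    Assumed reply convention: "ANSWER: <text>" ends the loop; any other
    reply is treated as the next entity to read from.
    """
    entity = start
    for _ in range(max_hops):
        evidence = invoke_neighbors(kg, entity)
        if not evidence:
            break
        prompt = (
            f"Question: {question}\n"
            f"Evidence: {linearize(evidence)}\n"
            "Reply 'ANSWER: <text>' or name the next entity to explore."
        )
        reply = llm(prompt)  # Generation (reasoning)
        if reply.startswith("ANSWER:"):
            return reply[len("ANSWER:"):].strip()
        entity = reply.strip()
    return entity


def toy_llm(prompt: str) -> str:
    """Rule-based stand-in for an LLM, used only to make the demo run."""
    if "continent, Europe" in prompt:
        return "ANSWER: Europe"
    return "France"


kg = [
    ("Paris", "capital_of", "France"),
    ("France", "continent", "Europe"),
]
question = "Which continent is the country whose capital is Paris in?"
result = iterative_read_then_reason(kg, "Paris", question, toy_llm)
```

In this toy run the first hop reads the triples about Paris and the stand-in model picks France as the next entity; the second hop reads the triples about France and the model answers. Swapping `toy_llm` for a call to an actual model would keep the loop structure the same while changing only the generation step.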