Title: Enabling Ensemble Learning for Heterogeneous Large Language Models with Deep Parallel Collaboration

Apr 19, 2024

Yichong Huang, Xiaocheng Feng, Baohang Li, Yang Xiang, Hui Wang, Bing Qin, Ting Liu

Abstract: Large language models (LLMs) show complementary strengths across tasks and instances, which motivates research on ensembling LLMs to push the frontier by leveraging the wisdom of the crowd. Existing work achieves this by training an extra reward model or fusion model to select or fuse the candidate answers. However, these methods pose a great challenge to the generalizability of the trained models. In addition, existing methods use textual responses as the communication medium, ignoring the rich information in the inner representations of neural networks. We therefore propose DEEPEN, a training-free ensemble framework that averages the probability distributions output by different LLMs. A key challenge in this paradigm is the vocabulary discrepancy between heterogeneous LLMs, which prevents their probability distributions from being averaged directly. To address this challenge, DEEPEN maps the probability distribution of each model from its probability space to a universal relative space, based on relative representation theory, and performs the aggregation there. The aggregated result is then mapped back to the probability space of one LLM via a search-based inverse transformation to determine the generated token. We conduct experiments on ensembles of various LLMs ranging from 6B to 70B parameters. Experimental results show that DEEPEN achieves consistent improvements across six popular benchmarks covering subject examination, reasoning, and knowledge QA, demonstrating the effectiveness of the approach.
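The three steps named in the abstract (map to a relative space, aggregate, search for an inverse) can be illustrated with a toy NumPy sketch. This is not the authors' implementation, and the abstract does not specify the transformations. The sketch assumes that each token is represented by its cosine similarity to a set of anchor tokens shared by both vocabularies, that aggregation is a plain average, and that the inverse is found by gradient descent over the main model's logits. All names (`relative_matrix`, `to_relative`, `inverse_search`), the random embeddings, and the anchor ids are hypothetical.

```python
import numpy as np

def relative_matrix(emb, anchor_ids):
    """Cosine similarity of every token embedding to the anchor embeddings.
    Assumed form of the relative representation; shape (vocab, n_anchors)."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit[anchor_ids].T

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def to_relative(p, rel):
    """Map a probability distribution into the shared relative space."""
    return p @ rel

def inverse_search(target, rel, p_init, steps=300, lr=0.05):
    """Search for a distribution p whose relative representation is close
    to `target`, by gradient descent on the logits (assumed search method)."""
    z = np.log(p_init + 1e-12)
    for _ in range(steps):
        p = softmax(z)
        grad_p = 2.0 * rel @ (p @ rel - target)  # d||p R - t||^2 / dp
        grad_z = p * (grad_p - p @ grad_p)       # chain rule through softmax
        z -= lr * grad_z
    return softmax(z)

# Toy setup: two hypothetical models with different vocabulary sizes.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(12, 16))  # model A: 12 tokens
emb_b = rng.normal(size=(9, 16))   # model B: 9 tokens
anchors_a = [0, 1, 2, 3]           # ids of the shared tokens in A (assumed)
anchors_b = [5, 6, 7, 8]           # ids of the same tokens in B (assumed)

rel_a = relative_matrix(emb_a, anchors_a)
rel_b = relative_matrix(emb_b, anchors_b)

p_a = softmax(rng.normal(size=12))  # next-token distribution of model A
p_b = softmax(rng.normal(size=9))   # next-token distribution of model B

# Aggregate in the relative space, where both vectors have n_anchors entries.
target = (to_relative(p_a, rel_a) + to_relative(p_b, rel_b)) / 2

# Map back to model A's probability space and pick the token.
p_new = inverse_search(target, rel_a, p_a)
token_id = int(np.argmax(p_new))
```

The point of the sketch is that the two distributions have different lengths (12 and 9) and cannot be averaged directly, while their relative representations share the anchor dimension and can. With random embeddings the two relative spaces are not meaningfully aligned, so the example shows only the mechanics.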

* 12 pages, 5 figures
