Title: Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems

Feb 06, 2025

Shangbin Feng, Zifeng Wang, Palash Goyal, Yike Wang, Weijia Shi, Huang Xia, Hamid Palangi, Luke Zettlemoyer, Yulia Tsvetkov, Chen-Yu Lee (+1 more)

Abstract: We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights. We represent multi-LLM systems as directed acyclic graphs (DAGs) of LLMs with topological message passing for collaborative generation. Given a pool of LLM experts and a utility function, Heterogeneous Swarms alternates between two iterative steps: the role-step and the weight-step. In the role-step, we interpret model roles as learning a DAG that specifies the flow of inputs and outputs between LLMs. Starting from a swarm of random continuous adjacency matrices, we decode them into discrete DAGs, call the LLMs in topological order, evaluate on the utility function (e.g., accuracy on a task), and optimize the adjacency matrices with particle swarm optimization based on the utility score. In the weight-step, we assess the contribution of individual LLMs in the multi-LLM systems and optimize model weights with swarm intelligence. We propose the JFK-score to quantify the individual contribution of each LLM in the best-found DAG of the role-step, then optimize model weights with particle swarm optimization based on the JFK-score. Experiments demonstrate that Heterogeneous Swarms outperforms 15 role- and/or weight-based baselines by 18.5% on average across 12 tasks. Further analysis reveals that Heterogeneous Swarms discovers multi-LLM systems with heterogeneous model roles and substantial collaborative gains, and benefits from the diversity of language models.
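The role-step loop described in the abstract (continuous adjacency matrices, decoding to a DAG, topological message passing, particle swarm updates) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' implementation: the decoding rule (rank nodes by outgoing weight, keep above-threshold edges that respect the ranking), the choice of the last node in topological order as the system output, the PSO hyperparameters, and the names `decode_dag`, `run_dag`, `pso_step`, and `optimize_roles`. The "LLMs" are plain callables, and the weight-step and JFK-score are not covered because the abstract does not define them.

```python
import copy
import random
from graphlib import TopologicalSorter


def decode_dag(A, threshold=0.5):
    """Decode a continuous n x n matrix into {node: set of predecessors}.

    Assumed scheme: rank nodes by total outgoing weight and keep edge i -> j
    only if i ranks before j and A[i][j] > threshold, so no cycle can form.
    """
    n = len(A)
    order = sorted(range(n), key=lambda i: -sum(A[i]))
    rank = {node: pos for pos, node in enumerate(order)}
    return {j: {i for i in range(n) if rank[i] < rank[j] and A[i][j] > threshold}
            for j in range(n)}


def run_dag(preds, models, query):
    """Call models in topological order; each sees the query plus upstream outputs."""
    outputs = {}
    order = list(TopologicalSorter(preds).static_order())
    for node in order:
        upstream = [outputs[p] for p in sorted(preds[node])]
        outputs[node] = models[node](query, upstream)
    return outputs[order[-1]]  # assumption: the last node gives the final answer


def pso_step(positions, velocities, personal_best, global_best, rng,
             inertia=0.5, c1=1.0, c2=1.0):
    """One standard PSO update applied entry-wise, clipped to [0, 1], in place."""
    for x, v, pb in zip(positions, velocities, personal_best):
        for i in range(len(x)):
            for j in range(len(x)):
                r1, r2 = rng.random(), rng.random()
                v[i][j] = (inertia * v[i][j]
                           + c1 * r1 * (pb[i][j] - x[i][j])
                           + c2 * r2 * (global_best[i][j] - x[i][j]))
                x[i][j] = min(1.0, max(0.0, x[i][j] + v[i][j]))


def optimize_roles(n_models, utility, n_particles=8, n_iters=20, seed=0):
    """Search adjacency matrices with PSO; utility maps a decoded DAG to a score."""
    rng = random.Random(seed)
    n = n_models
    positions = [[[rng.random() for _ in range(n)] for _ in range(n)]
                 for _ in range(n_particles)]
    velocities = [[[0.0] * n for _ in range(n)] for _ in range(n_particles)]
    personal_best = copy.deepcopy(positions)
    personal_score = [utility(decode_dag(A)) for A in positions]
    g = max(range(n_particles), key=personal_score.__getitem__)
    global_best, global_score = copy.deepcopy(positions[g]), personal_score[g]
    for _ in range(n_iters):
        pso_step(positions, velocities, personal_best, global_best, rng)
        for k, A in enumerate(positions):
            score = utility(decode_dag(A))
            if score > personal_score[k]:
                personal_best[k], personal_score[k] = copy.deepcopy(A), score
            if score > global_score:
                global_best, global_score = copy.deepcopy(A), score
    return decode_dag(global_best), global_score
```

In a real setting, `utility` would call `run_dag` with actual LLM wrappers on a validation set and return, for example, task accuracy; here it can be any function of the decoded graph, which keeps the sketch cheap to run.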
