Title: When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models

Jun 19, 2024

Ting-Yun Chang, Jesse Thomason, Robin Jia

Abstract: This paper studies in-context learning (ICL) by decomposing the output of large language models into the individual contributions of attention heads and MLPs (components). We observe curious components: good-performing ones that individually do well on a classification task, even when the model performs poorly; bad-performing ones that do much worse than chance; and label-biased components that always predict the same label. We find that component accuracies are well-correlated across different demonstration sets and perturbations of prompt templates, even when the full-model accuracy varies greatly. Based on our findings, we propose component reweighting, which learns to linearly re-scale the component activations from a few labeled examples. Given 24 labeled examples, our method improves accuracy by an average of 6.0 percentage points over 24-shot ICL across 8 tasks on Llama-2-7B. Overall, this paper both enriches our understanding of ICL and provides a practical method for improvement by examining model internals.
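The reweighting idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the full model's class logits are the unweighted sum of per-component contributions, and it uses synthetic contributions (one "good", one "bad", one "label-biased" component) in place of real Llama-2-7B activations. The names `reweight_components`, `accuracy`, and `contribs`, the cross-entropy loss, and the learning rate are all illustrative assumptions.

```python
import numpy as np

def reweight_components(contribs, labels, lr=0.1, steps=200):
    """Learn one scalar weight per component (hypothetical helper).

    contribs: (n_examples, n_components, n_classes) per-component logit contributions.
    labels:   (n_examples,) integer class ids.
    Weights start at 1, which reproduces the unweighted sum (the assumed full model).
    """
    n, c, k = contribs.shape
    w = np.ones(c)
    onehot = np.eye(k)[labels]
    for _ in range(steps):
        logits = np.einsum("c,nck->nk", w, contribs)
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradient of the mean cross-entropy with respect to each component weight.
        grad = np.einsum("nk,nck->c", probs - onehot, contribs) / n
        w -= lr * grad
    return w

def accuracy(logits, labels):
    return float((logits.argmax(axis=1) == labels).mean())

# Synthetic stand-in for 24 labeled examples of a binary task (assumption).
n = 24
labels = np.arange(n) % 2
signed = 2.0 * np.eye(2)[labels] - 1.0           # +1 on the correct class, -1 on the other
good = 1.0 * signed                              # individually always right
bad = -1.5 * signed                              # individually always wrong
biased = np.tile(np.array([2.0, 0.0]), (n, 1))   # always favors class 0
contribs = np.stack([good, bad, biased], axis=1)

full_acc = accuracy(contribs.sum(axis=1), labels)   # unweighted sum of components
good_acc = accuracy(good, labels)                   # best single component
w = reweight_components(contribs, labels)
reweighted_acc = accuracy(np.einsum("c,nck->nk", w, contribs), labels)
print(full_acc, good_acc, reweighted_acc, w)
```

In this toy setup the unweighted sum is dominated by the bad and label-biased components, while a single component is perfectly accurate; learning the weights raises the good component's weight and lowers the bad one's. How the paper extracts component contributions and which loss or regularization it uses is not stated in the abstract, so those parts should be read as placeholders.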
