Title: In-Context Learning in Large Language Models: A Neuroscience-inspired Analysis of Representations

Oct 18, 2023

Safoora Yousefi, Leo Betthauser, Hosein Hasanbeig, Akanksha Saran, Raphaël Millière, Ida Momennejad

Abstract: Large language models (LLMs) exhibit remarkable performance improvement through in-context learning (ICL) by leveraging task-specific examples in the input. However, the mechanisms behind this improvement remain elusive. In this work, we investigate embeddings and attention representations in Llama-2 70B and Vicuna 13B. Specifically, we study how embeddings and attention change after in-context learning, and how these changes mediate improvement in behavior. We employ neuroscience-inspired techniques, such as representational similarity analysis (RSA), and propose novel methods for parameterized probing and attention ratio analysis (ARA, which measures the ratio of attention to relevant vs. irrelevant information). We designed three tasks with a priori relationships among their conditions: reading comprehension, linear regression, and adversarial prompt injection. We formed hypotheses about expected similarities in task representations to investigate latent changes in embeddings and attention. Our analyses revealed a meaningful correlation between changes in both embedding and attention representations and improvements in behavioral performance after ICL. This empirical framework enables a nuanced understanding of how latent representations affect LLM behavior with and without ICL, offering valuable tools and insights for future research and practical applications.

* Added overview figures 1-3 in this version
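
The abstract names two analyses, RSA and ARA, without giving formulas. As a rough illustration only, the sketch below shows one common way such quantities could be computed with NumPy. Everything here is an assumption rather than the authors' method: the function names (`similarity_matrix`, `rsa_score`, `attention_ratio`), the choice of cosine similarity and Pearson correlation for RSA, the use of mean (rather than summed) attention for the ratio, and the toy data.

```python
import numpy as np


def similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of `embeddings`."""
    x = np.asarray(embeddings, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return unit @ unit.T


def rsa_score(embeddings, hypothesis):
    """Pearson correlation between observed and hypothesized similarities.

    Only the upper triangle (excluding the diagonal) is compared, since the
    matrices are symmetric and self-similarity is uninformative.
    """
    sim = similarity_matrix(embeddings)
    hyp = np.asarray(hypothesis, dtype=float)
    iu = np.triu_indices(sim.shape[0], k=1)
    return float(np.corrcoef(sim[iu], hyp[iu])[0, 1])


def attention_ratio(attention, relevant_mask):
    """Mean attention on relevant tokens divided by mean on irrelevant ones.

    Assumption: means are used so that spans of different length are
    comparable; the paper may define the ratio differently.
    """
    a = np.asarray(attention, dtype=float)
    m = np.asarray(relevant_mask, dtype=bool)
    return float(a[m].mean() / a[~m].mean())


# Toy data (hypothetical): four prompts from two task conditions.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_hypothesis = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
toy_rsa = rsa_score(toy_embeddings, toy_hypothesis)

# Toy attention over five tokens; the first two are marked as relevant.
toy_attention = [0.4, 0.3, 0.1, 0.1, 0.1]
toy_mask = [True, True, False, False, False]
toy_ratio = attention_ratio(toy_attention, toy_mask)
```

Under these assumptions, a higher `rsa_score` means the embedding geometry matches the hypothesized condition structure more closely, and an `attention_ratio` above 1 means relevant tokens receive more attention per token than irrelevant ones. Comparing either quantity with and without in-context examples would mirror the before/after contrast the abstract describes.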
