Title: Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models

Nov 12, 2023

Sean Xie, Soroush Vosoughi, Saeed Hassanpour

Abstract: Large Language Models (LLMs) have significantly advanced the field of Natural Language Processing (NLP), but their lack of interpretability has been a major concern. Current methods for interpreting LLMs are post hoc, applied after inference time, and have limitations such as their focus on low-level features and lack of explainability at higher-level text units. In this work, we introduce proto-lm, a prototypical network-based white-box framework that allows LLMs to learn immediately interpretable embeddings during the fine-tuning stage while maintaining competitive performance. Our method's applicability and interpretability are demonstrated through experiments on a wide range of NLP tasks, and our results indicate a new possibility of creating interpretable models without sacrificing performance. This approach to interpretability in LLMs can pave the way for more models that are interpretable by design.

* Accepted to the Findings of EMNLP 2023
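
The abstract does not spell out the architecture, so the following is only a minimal sketch of the general prototypical-network idea it refers to, not the authors' implementation. It assumes a text embedding is compared against a small set of learned prototype vectors and that class scores are a weighted sum of those similarities; the exponential distance-based similarity, the function names `similarity` and `classify`, and the toy numbers are all illustrative assumptions.

```python
import math

def similarity(embedding, prototype):
    """Similarity that decays with squared Euclidean distance (an assumed choice)."""
    sq_dist = sum((e - p) ** 2 for e, p in zip(embedding, prototype))
    return math.exp(-sq_dist)

def classify(embedding, prototypes, class_weights):
    """Score classes from prototype similarities.

    prototypes: list of prototype vectors (hypothetical, normally learned).
    class_weights[c][k]: contribution of prototype k to class c.
    Returns the predicted class index and the per-prototype similarities.
    """
    sims = [similarity(embedding, p) for p in prototypes]
    logits = [sum(w * s for w, s in zip(row, sims)) for row in class_weights]
    predicted = max(range(len(logits)), key=logits.__getitem__)
    return predicted, sims

# Toy example: two prototypes, each tied to one class.
prototypes = [[0.0, 0.0], [1.0, 1.0]]
class_weights = [[1.0, 0.0], [0.0, 1.0]]

predicted, sims = classify([0.9, 1.1], prototypes, class_weights)
# The most similar prototype serves as the explanation for the prediction.
nearest = max(range(len(sims)), key=sims.__getitem__)
```

In a sketch like this, the prediction can be read directly off the similarities: the input is assigned to a class because it lies close to that class's prototype, which is the sense in which such a model is interpretable without a separate post hoc explanation step.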
