Title: Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers

May 22, 2024

Tobias Leemann, Alina Fastowski, Felix Pfeiffer, Gjergji Kasneci

Abstract: We address the critical challenge of applying feature attribution methods to the transformer architecture, which dominates current applications in natural language processing and beyond. Traditional attribution methods in explainable AI (XAI) explicitly or implicitly rely on linear or additive surrogate models to quantify the impact of input features on a model's output. In this work, we formally prove an alarming incompatibility: transformers are structurally incapable of aligning with popular surrogate models for feature attribution, which undermines the grounding of these conventional explanation methodologies. To address this discrepancy, we introduce the Softmax-Linked Additive Log-Odds Model (SLALOM), a novel surrogate model specifically designed to align with the transformer framework. Unlike existing methods, SLALOM can deliver a range of faithful and insightful explanations across both synthetic and real-world datasets. Showing that diverse explanations computed from SLALOM outperform common surrogate explanations on different tasks, we highlight the need for task-specific feature attributions rather than a one-size-fits-all approach.
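
The abstract does not give SLALOM's functional form, so the following is only a minimal sketch of the contrast it describes, under stated assumptions. It compares a plain additive surrogate (log-odds as a sum of per-token weights) with a hypothetical softmax-weighted surrogate in which each token has an importance score and a value score, and the output is the softmax-weighted average of the values. The function names, the two-score parameterization, and the toy scores are illustrative assumptions inferred from the model's name, not the authors' implementation.

```python
import math


def additive_log_odds(tokens, weights, bias=0.0):
    """Additive surrogate: log-odds is a bias plus a sum of per-token weights."""
    return bias + sum(weights[t] for t in tokens)


def softmax_weighted_log_odds(tokens, importance, value):
    """Hypothetical softmax-linked surrogate (an assumption, not the paper's code).

    Each token t has an importance score importance[t] and a value score
    value[t]. The output is the average of the values, weighted by a
    softmax over the importance scores of the tokens in the sequence.
    """
    # Subtract the maximum score for numerical stability of the softmax.
    max_score = max(importance[t] for t in tokens)
    exp_scores = [math.exp(importance[t] - max_score) for t in tokens]
    total = sum(exp_scores)
    return sum(e / total * value[t] for e, t in zip(exp_scores, tokens))


# Toy scores, chosen only for illustration.
weights = {"great": 2.0, "the": 0.0, "bad": -2.0}
importance = {"great": 1.0, "the": -1.0, "bad": 1.0}
value = {"great": 2.0, "the": 0.0, "bad": -2.0}

short_seq = ["great"]
long_seq = ["great"] * 3

# The additive surrogate grows with every repetition of the token.
print(additive_log_odds(short_seq, weights), additive_log_odds(long_seq, weights))

# The softmax-weighted surrogate is a normalized average, so repeating
# the same token leaves the output essentially unchanged.
print(
    softmax_weighted_log_odds(short_seq, importance, value),
    softmax_weighted_log_odds(long_seq, importance, value),
)
```

In this toy setting the additive surrogate triples its output when the token is repeated three times, while the softmax-weighted one stays at the token's value score. That difference in how the two forms respond to sequence length is one intuition for why a single additive attribution may not fit an attention-based model, though the paper's formal argument should be consulted for the actual proof.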
