Patrik Zavoral

Adversarial Testing as a Tool for Interpretability: Length-based Overfitting of Elementary Functions in Transformers

Oct 17, 2024

Patrik Zavoral, Dušan Variš, Ondřej Bojar

Abstract: The Transformer model tends to overfit various aspects of the training data, such as the overall sequence length. We study elementary string edit functions using a defined set of error indicators to interpret the behaviour of the sequence-to-sequence Transformer. We show that generalization to shorter sequences is often possible, but confirm that longer sequences are highly problematic, although partially correct answers are often obtained. Additionally, we find that other structural characteristics of the sequences, such as subsegment length, may be equally important. We hypothesize that the models learn algorithmic aspects of the tasks simultaneously with structural aspects, but that when the two come into conflict, the Transformer unfortunately often prefers to adhere to the structural aspects.
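
The length-based setup described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' code: the choice of string reversal as the elementary edit function, the length ranges, and the names `reverse_task`, `make_split`, and `error_indicators` are all hypothetical, and the indicators shown are simple stand-ins rather than the paper's defined set.

```python
import random
import string


def reverse_task(s: str) -> str:
    """Hypothetical elementary string edit function: reverse the input."""
    return s[::-1]


def make_split(task, lengths, n_per_length, seed=0):
    """Build (source, target) pairs with source lengths drawn from `lengths`."""
    rng = random.Random(seed)
    pairs = []
    for n in lengths:
        for _ in range(n_per_length):
            src = "".join(rng.choice(string.ascii_lowercase) for _ in range(n))
            pairs.append((src, task(src)))
    return pairs


def error_indicators(prediction: str, target: str) -> dict:
    """Illustrative indicators that separate full from partial correctness."""
    prefix = 0
    for p, t in zip(prediction, target):
        if p != t:
            break
        prefix += 1
    return {
        "exact_match": prediction == target,
        "length_matches": len(prediction) == len(target),
        "correct_prefix": prefix,
    }


# Assumed length ranges: train on a middle band, test outside it.
train = make_split(reverse_task, range(11, 16), 100, seed=1)
test_shorter = make_split(reverse_task, range(1, 11), 20, seed=2)
test_longer = make_split(reverse_task, range(16, 26), 20, seed=3)
```

Scoring a model's outputs on `test_shorter` and `test_longer` with such indicators would show whether a wrong answer is still partially correct, for example a correct prefix with the wrong overall length.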

* 9 pages, 8 figures, 2 tables; to be published
