Title: Classifying Scientific Publications with BERT -- Is Self-Attention a Feature Selection Method?

Jan 20, 2021

Andres Garcia-Silva, Jose Manuel Gomez-Perez

Abstract: We investigate the self-attention mechanism of BERT in a fine-tuning scenario for the classification of scientific articles over a taxonomy of research disciplines. We observe how self-attention focuses on words that are highly related to the domain of the article. In particular, a small subset of vocabulary words tends to receive most of the attention. We compare and evaluate the subset of the most attended words with feature selection methods normally used for text classification in order to characterize self-attention as a possible feature selection approach. Using ConceptNet as ground truth, we also find that attended words are more related to the research fields of the articles. However, conventional feature selection methods are still a better option to learn classifiers from scratch. This result suggests that, while self-attention identifies domain-relevant terms, the discriminatory information in BERT is encoded in the contextualized outputs and the classification layer. It also raises the question of whether injecting feature selection methods in the self-attention mechanism could further optimize single-sequence classification using transformers.

* Paper accepted for publication at ECIR2021
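
The comparison described in the abstract (ranking vocabulary words by the attention they receive, then comparing that subset with the output of a conventional feature selector) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state how attention is aggregated across heads and layers, so the sketch takes a single, already-averaged attention matrix per text, and the toy tokens, weights, and `baseline` list are invented. The function names `attention_received`, `top_attended`, and `jaccard` are hypothetical and do not come from the paper.

```python
from collections import Counter


def attention_received(tokens, attention):
    """Sum the attention each word receives within one text.

    Assumption: attention[i][j] is the weight that position i assigns
    to position j, so column j holds the attention received by token j.
    """
    totals = Counter()
    for j, word in enumerate(tokens):
        totals[word] += sum(row[j] for row in attention)
    return totals


def top_attended(examples, k):
    """Accumulate received attention per vocabulary word; return the top k words."""
    corpus_totals = Counter()
    for tokens, attention in examples:
        corpus_totals.update(attention_received(tokens, attention))
    return [word for word, _ in corpus_totals.most_common(k)]


def jaccard(a, b):
    """Overlap between two selected-word sets (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


# Toy data (invented): two short texts with one attention matrix each.
examples = [
    (["the", "protein", "binds"],
     [[0.1, 0.7, 0.2], [0.2, 0.6, 0.2], [0.1, 0.5, 0.4]]),
    (["the", "genome", "protein"],
     [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3], [0.1, 0.3, 0.6]]),
]

attended = top_attended(examples, k=2)

# Hypothetical output of a conventional feature selector on the same corpus.
baseline = ["protein", "cell"]

print(attended, jaccard(attended, baseline))
```

In practice the attention matrices would come from a fine-tuned model and the baseline from a standard selector such as chi-square; the set overlap is only one simple way to compare the two word subsets and is not necessarily the evaluation used in the paper.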
