Title: Are ELECTRA's Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity

Feb 20, 2024

Ivan Rep, David Dukić, Jan Šnajder

Abstract: While BERT produces high-quality sentence embeddings, its pre-training computational cost is a significant drawback. In contrast, ELECTRA offers a cost-effective pre-training objective and improved downstream task performance, but its sentence embeddings are less performant. The community has tacitly stopped using ELECTRA's sentence embeddings for semantic textual similarity (STS). We notice a significant drop in performance when using the ELECTRA discriminator's last layer in comparison to earlier layers. We explore this drop and devise a way to repair ELECTRA's embeddings, proposing a novel truncated model fine-tuning (TMFT) method. TMFT improves the Spearman correlation coefficient by over 8 points while increasing parameter efficiency on the STS benchmark dataset. We extend our analysis to various model sizes and languages. Further, we discover the surprising efficacy of ELECTRA's generator model, which performs on par with BERT while using significantly fewer parameters and a substantially smaller embedding size. Finally, we observe further boosts by combining TMFT with a word similarity task or domain adaptive pre-training.
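To make the evaluation setup in the abstract concrete, the following is a minimal sketch of how one might score sentence embeddings taken from a chosen layer on an STS-style task: mean-pool the token vectors of that layer, compare sentence pairs by cosine similarity, and report the Spearman correlation with the gold scores. It is not the authors' implementation and does not perform any fine-tuning. The pooling choice, the function names (`mean_pool`, `sts_score`, and so on), and the nested-list data layout are all assumptions made for illustration.

```python
import math


def mean_pool(token_vectors):
    """Average a list of token vectors into one sentence vector (assumed pooling)."""
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(tok[i] for tok in token_vectors) / n for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sts_score(hidden_a, hidden_b, gold, layer):
    """Spearman correlation between gold scores and cosine similarities.

    hidden_a[n][layer] and hidden_b[n][layer] are hypothetical per-layer
    token vectors for the two sentences of pair n.
    """
    sims = [
        cosine(mean_pool(a[layer]), mean_pool(b[layer]))
        for a, b in zip(hidden_a, hidden_b)
    ]
    return spearman(sims, gold)


# Toy data (made up): three sentence pairs, two "layers", one token each.
toy_a = [[[[1.0, 0.0]], [[1.0, 0.0]]] for _ in range(3)]
toy_b = [
    [[[1.0, 0.0]], [[0.0, 1.0]]],
    [[[1.0, 1.0]], [[1.0, 1.0]]],
    [[[0.0, 1.0]], [[1.0, 0.0]]],
]
toy_gold = [5.0, 3.0, 0.0]

# Compare layers by looping over the layer index.
per_layer = [sts_score(toy_a, toy_b, toy_gold, layer) for layer in range(2)]
```

With real models, the per-layer token vectors would come from the encoder's hidden states rather than from hand-written lists, and the loop over `layer` is where a drop between earlier layers and the last layer would show up.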

* 7 pages, 9 figures, 2 tables
