Liwei Peng

StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding

Aug 16, 2019

Wei Wang, Bin Bi, Ming Yan, Chen Wu, Zuyi Bao, Liwei Peng, Luo Si

Abstract: Recently, the pre-trained language model BERT has attracted a lot of attention in natural language understanding (NLU) and achieved state-of-the-art accuracy in various NLU tasks, such as sentiment classification, natural language inference, semantic textual similarity and question answering. Inspired by the linearization exploration work of Elman, we extend BERT to a new model, StructBERT, by incorporating language structures into pre-training. Specifically, we pre-train StructBERT with two auxiliary tasks to make the most of the sequential order of words and sentences, which leverage language structures at the word and sentence levels, respectively. As a result, the new model is adapted to different levels of language understanding required by downstream tasks. StructBERT with structural pre-training gives surprisingly good empirical results on a variety of downstream tasks, including pushing the state-of-the-art on the GLUE benchmark to 84.5 (ranked first on the leaderboard at the time of paper submission), the F1 score on SQuAD v1.1 question answering to 93.0, and the accuracy on SNLI to 91.7.

* 10 Pages
