Amelia Archer

Small and Practical BERT Models for Sequence Labeling

Aug 31, 2019

Henry Tsai, Jason Riesa, Melvin Johnson, Naveen Arivazhagan, Xin Li, Amelia Archer

Figures 1-4 for Small and Practical BERT Models for Sequence Labeling

Abstract: We propose a practical scheme to train a single multilingual sequence labeling model that yields state-of-the-art results and is small and fast enough to run on a single CPU. Starting from a public multilingual BERT checkpoint, our final model is 6x smaller and 27x faster, and has higher accuracy than a state-of-the-art multilingual baseline. We show that our model outperforms the baseline especially on low-resource languages, and that it works on codemixed input text without being explicitly trained on codemixed examples. We demonstrate the effectiveness of our method by reporting on part-of-speech tagging and morphological prediction on 70 treebanks and 48 languages.

* 11 pages including appendices; accepted to appear at EMNLP-IJCNLP 2019
