Title: Applying the Transformer to Character-level Transduction

May 20, 2020

Shijie Wu, Ryan Cotterell, Mans Hulden

Abstract: The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks. The model offers other benefits as well: it trains faster and has fewer parameters. Yet for character-level transduction tasks, e.g. morphological inflection generation and historical text normalization, few works have succeeded in outperforming recurrent models with the transformer. In an empirical study, we uncover that, in contrast to recurrent sequence-to-sequence models, the batch size plays a crucial role in the performance of the transformer on character-level tasks, and we show that with a large enough batch size, the transformer does indeed outperform recurrent models. We also introduce a simple technique to handle feature-guided character-level transduction that further improves performance. With these insights, we achieve state-of-the-art performance on morphological inflection and historical text normalization. We also show that the transformer outperforms a strong baseline on two other character-level transduction tasks: grapheme-to-phoneme conversion and transliteration. Code is available at https://github.com/shijie-wu/neural-transducer.
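To make the task setup concrete, here is a minimal sketch of how a feature-guided character-level transduction example, such as morphological inflection, might be framed as a sequence-to-sequence problem. It is an illustration only and is not taken from the authors' repository: the function names `encode_source` and `encode_target`, the `<...>` tag wrapping, and the `;`-separated tag string are all assumptions. It also does not implement the feature-handling technique introduced in the paper, which the abstract does not describe.

```python
from typing import List


def encode_source(lemma: str, tags: str) -> List[str]:
    """Build the source sequence: feature tokens followed by lemma characters.

    `tags` is assumed to be a ';'-separated string such as "V;PST".
    Wrapping each tag in angle brackets is a hypothetical convention that
    keeps feature tokens distinct from ordinary characters.
    """
    feature_tokens = [f"<{tag}>" for tag in tags.split(";") if tag]
    return feature_tokens + list(lemma)


def encode_target(form: str) -> List[str]:
    """Build the target sequence: one token per character of the inflected form."""
    return list(form)


# Example: inflect the lemma "walk" to its past-tense form "walked".
source = encode_source("walk", "V;PST")
target = encode_target("walked")
print(source)  # ['<V>', '<PST>', 'w', 'a', 'l', 'k']
print(target)  # ['w', 'a', 'l', 'k', 'e', 'd']
```

Under this framing, any sequence-to-sequence model, recurrent or transformer, consumes `source` and is trained to emit `target` one character at a time.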
