Jon Dehdari

Massively Multilingual Neural Grapheme-to-Phoneme Conversion

Aug 04, 2017

Ben Peters, Jon Dehdari, Josef van Genabith

Figures 1-4 for Massively Multilingual Neural Grapheme-to-Phoneme Conversion

Abstract: Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems. Most g2p systems are monolingual: they require language-specific data or handcrafted rules. Such systems are difficult to extend to low-resource languages, for which neither data nor handcrafted rules are available. As an alternative, we present a neural sequence-to-sequence approach to g2p that is trained on spelling–pronunciation pairs in hundreds of languages. The system shares a single encoder and decoder across all languages, allowing it to exploit the intrinsic similarities between different writing systems. We show an 11% improvement in phoneme error rate over an approach based on adapting high-resource monolingual g2p models to low-resource languages. Our model is also much more compact than previous approaches.
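The abstract reports its result as phoneme error rate but does not spell out how the metric is computed. The sketch below assumes the common definition: the total Levenshtein (edit) distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The function names `edit_distance` and `phoneme_error_rate` and the toy phoneme sequences are illustrative only and are not taken from the paper.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(
    pairs: Iterable[Tuple[Sequence[str], Sequence[str]]]
) -> float:
    """Assumed PER: total edits / total reference phonemes.

    Each pair is (reference phonemes, predicted phonemes).
    """
    pairs = list(pairs)
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_ref = sum(len(ref) for ref, _ in pairs)
    if total_ref == 0:
        raise ValueError("reference set contains no phonemes")
    return total_edits / total_ref


# Toy data (hypothetical): one substitution across six reference phonemes.
toy_pairs = [
    (["k", "æ", "t"], ["k", "a", "t"]),
    (["d", "ɔ", "g"], ["d", "ɔ", "g"]),
]
toy_per = phoneme_error_rate(toy_pairs)
```

Sequences are compared as lists of phoneme symbols rather than raw strings, so multi-character phonemes count as a single unit.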

* EMNLP 2017 Workshop on Building Linguistically Generalizable NLP Systems