Julian Schamper

Generalizing Back-Translation in Neural Machine Translation

Jun 17, 2019

Miguel Graça, Yunsu Kim, Julian Schamper, Shahram Khadivi, Hermann Ney

Abstract: Back-translation, i.e., data augmentation by translating target monolingual data, is a crucial component in modern neural machine translation (NMT). In this work, we reformulate back-translation in the scope of cross-entropy optimization of an NMT model, clarifying its underlying mathematical assumptions and approximations beyond its heuristic usage. Our formulation covers broader synthetic data generation schemes, including sampling from a target-to-source NMT model. With this formulation, we point out fundamental problems of the sampling-based approaches and propose to remedy them by (i) disabling label smoothing for the target-to-source model and (ii) sampling from a restricted search space. Our statements are investigated on the WMT 2018 German-English news translation task.

* 4th Conference on Machine Translation (WMT 2019) camera-ready

Unsupervised Training for Large Vocabulary Translation Using Sparse Lexicon and Word Classes

Jan 06, 2019

Yunsu Kim, Julian Schamper, Hermann Ney

Abstract: We address for the first time unsupervised training for a translation task with hundreds of thousands of vocabulary words. We scale up the expectation-maximization (EM) algorithm to learn a large translation table without any parallel text or seed lexicon. First, we solve the memory bottleneck and enforce the sparsity with a simple thresholding scheme for the lexicon. Second, we initialize the lexicon training with word classes, which efficiently boosts the performance. Our methods produced promising results on two large-scale unsupervised translation tasks.

* Published in EACL 2017
