It is well accepted that the choice of token vocabulary substantially affects the performance of machine translation. However, due to the expense of trial training, most studies only conduct simple trials with dominant approaches (e.g., BPE) and commonly used vocabulary sizes. In this paper, we find a strong relation between an information-theoretic feature and BLEU scores. With this observation, we formulate the quest for vocabularization, i.e., finding the best token dictionary with a proper size, as an optimal transport problem. We then propose VOLT, a simple and efficient vocabularization solution that avoids full and costly trial training. We evaluate our approach on multiple machine translation tasks, including WMT-14 English-German translation, TED bilingual translation, and TED multilingual translation. Empirical results show that VOLT outperforms widely used vocabularies across diverse scenarios. For example, VOLT achieves a 70% vocabulary size reduction and a 0.6 BLEU gain on English-German translation. VOLT is also resource-efficient: compared to naive BPE search, it reduces the search time from 288 GPU hours to 0.5 CPU hours.
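To make the information-theoretic feature above concrete, the sketch below computes one plausible entropy-based score: corpus entropy over token frequencies, normalized by average token length, and the marginal change in that score when moving to a larger vocabulary. This is a minimal illustration under stated assumptions, not the paper's implementation; the toy corpus, the `normalized_entropy` helper, and the two candidate tokenizations are all made up for exposition.

```python
# Minimal sketch (assumptions, not the paper's code): score a candidate
# vocabulary by corpus entropy normalized by average token length, then
# compare two vocabulary sizes via the marginal entropy change.
import math
from collections import Counter

def normalized_entropy(tokenized_corpus):
    """Entropy over token frequencies, normalized by average token length."""
    counts = Counter(tokenized_corpus)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    avg_len = sum(len(tok) * c for tok, c in counts.items()) / total
    return entropy / avg_len

# Toy corpus segmented under two hypothetical vocabularies:
# a small (character-level) one and a larger (word-level) one.
text = "the cat sat on the mat"
small_vocab_tokens = list(text.replace(" ", ""))  # character tokens
large_vocab_tokens = text.split()                 # word tokens

h_small = normalized_entropy(small_vocab_tokens)
h_large = normalized_entropy(large_vocab_tokens)

# Marginal change in entropy as the vocabulary grows; a feature of this
# kind is the sort of signal one could correlate with BLEU scores.
marginal_utility = -(h_large - h_small)
print(f"entropy (char vocab): {h_small:.4f}")
print(f"entropy (word vocab): {h_large:.4f}")
print(f"marginal utility:     {marginal_utility:.4f}")
```

In this toy setting, the larger vocabulary lowers the normalized entropy but at the cost of more types; a feature that trades off these two quantities is what a search procedure such as VOLT can optimize without training a full translation model per candidate vocabulary.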