Michael Denkowski

Sockeye 3: Fast Neural Machine Translation with PyTorch

Jul 12, 2022

Felix Hieber, Michael Denkowski, Tobias Domhan, Barbara Darques Barros, Celina Dong Ye, Xing Niu, Cuong Hoang, Ke Tran, Benjamin Hsu, Maria Nadejde(+4 more)

Abstract: Sockeye 3 is the latest version of the Sockeye toolkit for Neural Machine Translation (NMT). Now based on PyTorch, Sockeye 3 provides faster model implementations and more advanced features with a further streamlined codebase. This enables broader experimentation with faster iteration, efficient training of stronger and faster models, and the flexibility to move new ideas quickly from research to production. When running comparable models, Sockeye 3 is up to 126% faster than other PyTorch implementations on GPUs and up to 292% faster on CPUs. Sockeye 3 is open source software released under the Apache 2.0 license.
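
To make the speed figures concrete, here is a minimal arithmetic sketch. It assumes that "N% faster" refers to throughput (for example, sentences translated per second), which the abstract does not state explicitly; the function names are illustrative only.

```python
def percent_faster_to_speedup(percent_faster: float) -> float:
    """Convert an 'N% faster' figure into a throughput multiplier.

    Assumption: 'faster' means higher throughput, so 100% faster = 2x.
    """
    return 1.0 + percent_faster / 100.0


def relative_time(percent_faster: float) -> float:
    """Fraction of the baseline wall-clock time needed for the same work."""
    return 1.0 / percent_faster_to_speedup(percent_faster)


# Figures quoted in the abstract (upper bounds: "up to").
gpu_speedup = percent_faster_to_speedup(126)  # roughly 2.26x throughput
cpu_speedup = percent_faster_to_speedup(292)  # roughly 3.92x throughput

print(f"GPU: {gpu_speedup:.2f}x throughput, {relative_time(126):.0%} of baseline time")
print(f"CPU: {cpu_speedup:.2f}x throughput, {relative_time(292):.0%} of baseline time")
```

Under this reading, the best-case GPU result corresponds to a bit more than double the baseline throughput, and the best-case CPU result to a bit less than four times.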

The Sockeye 2 Neural Machine Translation Toolkit at AMTA 2020

Aug 11, 2020

Tobias Domhan, Michael Denkowski, David Vilar, Xing Niu, Felix Hieber, Kenneth Heafield

Abstract: We present Sockeye 2, a modernized and streamlined version of the Sockeye neural machine translation (NMT) toolkit. New features include a simplified code base through the use of MXNet's Gluon API, a focus on state-of-the-art model architectures, distributed mixed precision training, and efficient CPU decoding with 8-bit quantization. These improvements result in faster training and inference, higher automatic metric scores, and a shorter path from research to production.
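
As a generic illustration of what 8-bit quantization means, the sketch below applies symmetric per-tensor quantization to a small list of weights. This is a textbook scheme chosen for clarity, not Sockeye 2's actual implementation, which the abstract does not describe; all names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to the range [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map 8-bit integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Each restored value differs from the original by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Storing weights as 8-bit integers plus a single scale reduces memory traffic and allows integer matrix multiplication, which is where CPU decoding speedups of this kind typically come from.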

Sockeye: A Toolkit for Neural Machine Translation

Jun 01, 2018

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton, Matt Post

Abstract: We describe Sockeye (version 1.12), an open-source sequence-to-sequence toolkit for Neural Machine Translation (NMT). Sockeye is a production-ready framework for training and applying models as well as an experimental platform for researchers. Written in Python and built on MXNet, the toolkit offers scalable training and inference for the three most prominent encoder-decoder architectures: attentional recurrent neural networks, self-attentional transformers, and fully convolutional networks. Sockeye also supports a wide range of optimizers, normalization and regularization techniques, and inference improvements from current NMT literature. Users can easily run standard training recipes, explore different model settings, and incorporate new ideas. In this paper, we highlight Sockeye's features and benchmark it against other NMT toolkits on two language arcs from the 2017 Conference on Machine Translation (WMT): English-German and Latvian-English. We report competitive BLEU scores across all three architectures, including an overall best score for Sockeye's transformer implementation. To facilitate further comparison, we release all system outputs and training scripts used in our experiments. The Sockeye toolkit is free software released under the Apache 2.0 license.

Bi-Directional Neural Machine Translation with Synthetic Parallel Data

May 30, 2018

Xing Niu, Michael Denkowski, Marine Carpuat

Abstract: Despite impressive progress in high-resource settings, Neural Machine Translation (NMT) still struggles in low-resource and out-of-domain scenarios, often failing to match the quality of phrase-based translation. We propose a novel technique that combines back-translation and multilingual NMT to improve performance in these difficult cases. Our technique trains a single model for both directions of a language pair, allowing us to back-translate source or target monolingual data without requiring an auxiliary model. We then continue training on the augmented parallel data, enabling a cycle of improvement for a single model that can incorporate any source, target, or parallel data to improve both translation directions. As a byproduct, these models can reduce training and deployment costs significantly compared to uni-directional models. Extensive experiments show that our technique outperforms standard back-translation in low-resource scenarios, improves quality on cross-domain tasks, and effectively reduces costs across the board.
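
The data flow described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the `<2xx>` target-language tag is a common multilingual NMT convention assumed here, and `translate` is a hypothetical stand-in for decoding with the single bi-directional model.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (tagged source sentence, target sentence)


def tag(sentence: str, target_lang: str) -> str:
    """Prepend a hypothetical target-language token such as '<2de>'."""
    return f"<2{target_lang}> {sentence}"


def make_bidirectional(parallel: List[Tuple[str, str]],
                       lang_a: str, lang_b: str) -> List[Pair]:
    """Turn A-B sentence pairs into training data for both directions."""
    data: List[Pair] = []
    for a, b in parallel:
        data.append((tag(a, lang_b), b))  # A -> B
        data.append((tag(b, lang_a), a))  # B -> A
    return data


def back_translate(mono: List[str], mono_lang: str, other_lang: str,
                   translate: Callable[[str], str]) -> List[Pair]:
    """Build synthetic pairs whose target side is genuine monolingual text.

    The same bi-directional model produces the synthetic source side,
    so no auxiliary reverse model is needed.
    """
    synthetic: List[Pair] = []
    for sentence in mono:
        hypothesis = translate(tag(sentence, other_lang))
        synthetic.append((tag(hypothesis, mono_lang), sentence))
    return synthetic


def toy_translate(tagged: str) -> str:
    """Toy placeholder for model decoding: drop the tag and uppercase."""
    return tagged.split(" ", 1)[1].upper()


train = make_bidirectional([("hello", "hallo")], "en", "de")
train += back_translate(["guten tag"], "de", "en", toy_translate)
print(train)
```

Because one model covers both directions, monolingual text in either language can be back-translated with it, and training can then continue on the original plus synthetic pairs.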

* Accepted at the 2nd Workshop on Neural Machine Translation and Generation (WNMT 2018)

Stronger Baselines for Trustable Results in Neural Machine Translation

Jun 29, 2017

Michael Denkowski, Graham Neubig

Abstract: Interest in neural machine translation has grown rapidly as its effectiveness has been demonstrated across language and data scenarios. New research regularly introduces architectural and algorithmic improvements that lead to significant gains over "vanilla" NMT implementations. However, these new techniques are rarely evaluated in the context of previously published techniques, specifically those that are widely used in state-of-the-art production and shared-task systems. As a result, it is often difficult to determine whether improvements from research will carry over to systems deployed for real-world use. In this work, we recommend three specific methods that are relatively easy to implement and result in much stronger experimental systems. Beyond reporting significantly higher BLEU scores, we conduct an in-depth analysis of where improvements originate and what inherent weaknesses of basic NMT models are being addressed. We then compare the relative gains afforded by several other techniques proposed in the literature when starting with vanilla systems versus our stronger baselines, showing that experimental conclusions may change depending on the baseline chosen. This indicates that choosing a strong baseline is crucial for reporting reliable experimental results.

* To appear at the Workshop on Neural Machine Translation (WNMT)
