Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Benoit Jacob

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Feb 21, 2019

Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu(+81 more)

Figure 1 for Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Figure 2 for Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Figure 3 for Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Abstract:Lingvo is a Tensorflow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly within the framework, and it contains existing implementations of a large number of utilities, helper functions, and the newest research ideas. Lingvo has been used in collaboration by dozens of researchers in more than 20 papers over the last two years. This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.

Via

Access Paper or Ask Questions

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Dec 15, 2017

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko

Figure 1 for Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Figure 2 for Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Figure 3 for Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Figure 4 for Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Abstract:The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.

* 14 pages, 12 figures

Via

Access Paper or Ask Questions