Title: Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

Jun 04, 2020

Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, Yida Wang

Abstract: Modern deep neural networks increasingly make use of features such as dynamic control flow, data structures, and dynamic tensor shapes. Existing deep learning systems focus on optimizing and executing static neural networks, which assume a pre-determined model architecture and input data shapes; dynamic neural networks violate both assumptions. Executing dynamic models with deep learning systems is therefore currently both inflexible and sub-optimal, if not impossible. Optimizing dynamic neural networks is more challenging than optimizing static ones, because optimizations must consider all possible execution paths and tensor shapes. This paper proposes Nimble, a high-performance and flexible system to optimize, compile, and execute dynamic neural networks on multiple platforms. Nimble handles model dynamism by introducing a dynamic type system, a set of dynamism-oriented optimizations, and a light-weight virtual machine runtime. Our evaluation demonstrates that Nimble outperforms state-of-the-art deep learning frameworks and runtime systems for dynamic neural networks by up to 20x on hardware platforms including Intel CPUs, ARM CPUs, and Nvidia GPUs.
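
To make the idea of a type system that tolerates dynamic tensor shapes more concrete, here is a minimal sketch of compile-time shape inference in which a dimension may be unknown until run time. It is an illustration only, not Nimble's implementation or API: the names `Dim`, `Shape`, `unify_dim`, and `matmul_shape`, and the use of `None` to mark an unknown dimension, are all assumptions made for this example.

```python
from typing import Optional, Tuple

# Hypothetical representation: None marks a dimension known only at run time.
Dim = Optional[int]
Shape = Tuple[Dim, ...]


def unify_dim(a: Dim, b: Dim) -> Dim:
    """Unify two dimensions; an unknown dimension is compatible with anything."""
    if a is None:
        return b
    if b is None:
        return a
    if a != b:
        raise TypeError(f"dimension mismatch: {a} vs {b}")
    return a


def matmul_shape(lhs: Shape, rhs: Shape) -> Shape:
    """Infer the result shape of a 2-D matrix multiply without running it."""
    if len(lhs) != 2 or len(rhs) != 2:
        raise TypeError("expected 2-D operands")
    # Inner dimensions must agree wherever both are statically known.
    unify_dim(lhs[1], rhs[0])
    return (lhs[0], rhs[1])


# A batch of unknown size multiplied by a fixed 128x10 weight matrix.
out = matmul_shape((None, 128), (128, 10))
print(out)
```

In this sketch, statically known dimensions are still checked ahead of time, while an unknown dimension is carried through to the result type, so any check involving it would have to be deferred to run time.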
