Fei Mao

Theano-MPI: a Theano-based Distributed Training Framework

May 26, 2016

He Ma, Fei Mao, Graham W. Taylor

Abstract: We develop a scalable and extendable training framework that can utilize GPUs across nodes in a cluster and accelerate the training of deep learning models based on data parallelism. Both synchronous and asynchronous training are implemented in our framework, where parameter exchange among GPUs is based on CUDA-aware MPI. In this report, we analyze the convergence and capability of the framework to reduce training time when scaling the synchronous training of AlexNet and GoogLeNet from 2 GPUs to 8 GPUs. In addition, we explore novel ways to reduce the communication overhead caused by exchanging parameters. Finally, we release the framework as open-source for further research on distributed deep learning.
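
The synchronous data-parallel scheme mentioned in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the Theano-MPI implementation: plain Python lists stand in for GPUs, the hypothetical helper `allreduce_mean` stands in for an MPI all-reduce, and the model is a one-parameter linear fit chosen purely for brevity.

```python
# Illustrative simulation only. None of these names come from Theano-MPI;
# they are hypothetical helpers for showing the idea of synchronous
# data-parallel SGD with gradient averaging.

def local_gradient(w, shard):
    """Gradient of the mean squared error for y ~ w * x on one worker's shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for an all-reduce: every worker receives the same average."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def sync_sgd(shards, w0=0.0, lr=0.01, steps=200):
    """Each worker keeps a replica of w and applies the averaged gradient."""
    replicas = [w0] * len(shards)
    for _ in range(steps):
        # Every worker computes a gradient on its own shard of the data.
        grads = [local_gradient(w, s) for w, s in zip(replicas, shards)]
        # Workers exchange gradients so that all replicas stay identical.
        averaged = allreduce_mean(grads)
        replicas = [w - lr * g for w, g in zip(replicas, averaged)]
    return replicas

# Toy data generated from y = 3x, split evenly across two simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[0::2], data[1::2]]
replicas = sync_sgd(shards)
```

With equal-sized shards, the averaged gradient equals the full-batch gradient, so the simulated workers follow the same trajectory as single-device training on the combined batch. A real multi-GPU setup would replace `allreduce_mean` with an actual collective operation and would have to account for its communication cost.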

Theano-based Large-Scale Visual Recognition with Multiple GPUs

Apr 06, 2015

Weiguang Ding, Ruoyan Wang, Fei Mao, Graham Taylor

Abstract: In this report, we describe a Theano-based AlexNet (Krizhevsky et al., 2012) implementation and its naive data parallelism on multiple GPUs. Our performance on 2 GPUs is comparable with the state-of-the-art Caffe library (Jia et al., 2014) run on 1 GPU. To the best of our knowledge, this is the first open-source Python-based AlexNet implementation to date.

* ICLR 2015 workshop camera-ready version
