Title: GraNNDis: Efficient Unified Distributed Training Framework for Deep GNNs on Large Clusters

Nov 12, 2023

Jaeyong Song, Hongsun Jang, Jaewon Jung, Youngsok Kim, Jinho Lee


Abstract: Graph neural networks (GNNs) are one of the most rapidly growing fields within deep learning. As the datasets and models used for GNNs grow, it becomes nearly impossible to keep the whole network in GPU memory. Among numerous attempts, distributed training is one popular approach to address the problem. However, due to the nature of GNNs, existing distributed approaches suffer from poor scalability, mainly because of slow communication between servers. In this paper, we propose GraNNDis, an efficient distributed training framework for GNNs with many layers on large graphs. GraNNDis introduces three new techniques. First, shared preloading provides a training structure for a cluster of multi-GPU servers: essential vertex dependencies are preloaded server-wise to reduce low-bandwidth communication between servers. Second, expansion-aware sampling addresses the neighbor explosion that limits shared preloading on its own by reducing vertex dependencies that span server boundaries. Third, cooperative batching creates a unified framework for full-graph and mini-batch training and significantly reduces redundant memory usage in mini-batch training. Through this unification, GraNNDis enables a reasonable trade-off between full-graph and mini-batch training, especially when the entire graph does not fit into GPU memory. In experiments on a multi-server, multi-GPU cluster, GraNNDis provides superior speedup over state-of-the-art distributed GNN training frameworks.
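The neighbor explosion mentioned in the abstract, and the idea of limiting dependencies that cross server boundaries, can be illustrated with a toy sketch. This is not GraNNDis code. The function names, the `external_fanout` parameter, and the deterministic truncation used here in place of random sampling are all assumptions made for illustration; the paper's actual sampling rule is not described in the abstract.

```python
def build_adjacency(edges):
    """Build undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def khop_dependencies(adj, owned, num_layers, external_fanout=None):
    """Return the vertices needed to compute `num_layers`-layer GNN
    outputs for the vertices in `owned` (one server's partition).

    external_fanout=None keeps every neighbor, giving the full
    dependency set. Otherwise, a vertex outside `owned` expands to at
    most `external_fanout` neighbors per hop. This cap is a hypothetical
    stand-in for boundary-aware sampling, not the paper's rule.
    """
    owned = set(owned)
    needed = set(owned)
    frontier = set(owned)
    for _ in range(num_layers):
        next_frontier = set()
        for v in frontier:
            neighbors = sorted(adj.get(v, ()))
            if external_fanout is not None and v not in owned:
                # Deterministic truncation so the demo is reproducible;
                # a real sampler would pick neighbors at random.
                neighbors = neighbors[:external_fanout]
            next_frontier.update(neighbors)
        frontier = next_frontier - needed
        needed |= frontier
    return needed


# Toy graph: the server owns vertices 0 and 1; vertex 2 is an external
# hub with several further external neighbors.
edges = [(0, 1), (1, 2), (2, 3), (2, 4), (2, 5), (2, 6)]
adj = build_adjacency(edges)

full = khop_dependencies(adj, owned={0, 1}, num_layers=2)
capped = khop_dependencies(adj, owned={0, 1}, num_layers=2, external_fanout=2)

print(len(full), len(capped))
```

In this toy graph, a two-layer model on two owned vertices pulls in every vertex reachable through the external hub, while capping the expansion of external vertices keeps the preloaded set much smaller. The trade-off is that the capped set drops some true dependencies, which is why such a cap is a sampling approximation rather than an exact computation.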
