Title: Marius++: Large-Scale Training of Graph Neural Networks on a Single Machine

Feb 04, 2022

Roger Waleffe, Jason Mohoney, Theodoros Rekatsinas, Shivaram Venkataraman

Abstract: Graph Neural Networks (GNNs) have emerged as a powerful model for ML over graph-structured data. Yet, scalability remains a major challenge for using GNNs over billion-edge inputs. The creation of mini-batches used for training incurs computational and data movement costs that grow exponentially with the number of GNN layers, because state-of-the-art models aggregate information from the multi-hop neighborhood of each input node. In this paper, we focus on scalable training of GNNs with emphasis on resource efficiency. We show that out-of-core pipelined mini-batch training on a single machine outperforms resource-hungry multi-GPU solutions. We introduce Marius++, a system for training GNNs over billion-scale graphs. Marius++ provides disk-optimized training for GNNs and introduces a series of data organization and algorithmic contributions that 1) minimize the memory footprint and end-to-end time required for training and 2) ensure that models learned with disk-based training exhibit accuracy similar to those fully trained in mixed CPU/GPU settings. We evaluate Marius++ against PyTorch Geometric and Deep Graph Library using seven benchmark (model, data set) settings and find that Marius++ with one GPU can achieve the same level of model accuracy up to 8$\times$ faster than these systems when they are using up to eight GPUs. For these experiments, disk-based training allows Marius++ deployments to be up to 64$\times$ cheaper in monetary cost than those of the competing systems.
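
The exponential growth in mini-batch cost mentioned in the abstract can be illustrated with a back-of-the-envelope count. The sketch below assumes neighbor sampling with the same fixed fanout at every hop. The function name `max_sampled_nodes` and the fanout of 10 are illustrative assumptions, not values from the paper, and this is not the sampling procedure used by Marius++.

```python
def max_sampled_nodes(fanout: int, num_layers: int) -> int:
    """Upper bound on the nodes touched for ONE seed node.

    Assumes (hypothetically) that every node samples `fanout` neighbors
    at each hop and that no sampled neighbors overlap, so hop h adds
    fanout**h nodes. Real graphs overlap, so actual counts are lower.
    """
    return sum(fanout ** hop for hop in range(num_layers + 1))


if __name__ == "__main__":
    # Illustrative fanout of 10; one GNN layer corresponds to one hop.
    for layers in (1, 2, 3):
        print(layers, max_sampled_nodes(fanout=10, num_layers=layers))
```

Under these assumptions a single seed node touches at most 11, 111, and 1111 nodes for one, two, and three layers respectively, which is why both computation and data movement per mini-batch rise so quickly as layers are added.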