Daniel Stoll

An Experimental Comparison of Partitioning Strategies for Distributed Graph Neural Network Training

Aug 29, 2023

Nikolai Merkel, Daniel Stoll, Ruben Mayer, Hans-Arno Jacobsen

Abstract: Recently, graph neural networks (GNNs) have gained much attention as a growing area of deep learning capable of learning on graph-structured data. However, the computational and memory requirements for training GNNs on large-scale graphs can exceed the capabilities of single machines or GPUs, making distributed GNN training a promising direction for large-scale GNN training. A prerequisite for distributed GNN training is to partition the input graph into smaller parts that are distributed among multiple machines of a compute cluster. Although graph partitioning has been extensively studied with regard to graph analytics and graph databases, its effect on GNN training performance is largely unexplored. In this paper, we study the effectiveness of graph partitioning for distributed GNN training. Our study aims to understand how different factors such as GNN parameters, mini-batch size, graph type, feature size, and scale-out factor influence the effectiveness of graph partitioning. We conduct experiments with two different GNN systems using vertex and edge partitioning. We found that graph partitioning is a crucial pre-processing step that can substantially reduce the training time and memory footprint. Furthermore, our results show that the time invested in partitioning can be amortized by reduced GNN training time, making it a relevant optimization.
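
To make the distinction between vertex and edge partitioning concrete, the following is a minimal sketch on a toy graph. It is not taken from the paper. The two quality metrics used here, edge-cut ratio for vertex partitioning and replication factor for edge partitioning, are commonly used in the graph partitioning literature, but the abstract does not name them, so treating them as the relevant metrics is an assumption. The function names, the toy graph, and the partition assignments are all hypothetical.

```python
from collections import defaultdict

def edge_cut_ratio(edges, vertex_part):
    """Vertex partitioning: fraction of edges whose endpoints lie in different partitions."""
    cut = sum(1 for u, v in edges if vertex_part[u] != vertex_part[v])
    return cut / len(edges)

def replication_factor(edges, edge_part):
    """Edge partitioning: average number of partitions in which each vertex appears."""
    parts_of = defaultdict(set)
    for (u, v), p in zip(edges, edge_part):
        parts_of[u].add(p)
        parts_of[v].add(p)
    return sum(len(s) for s in parts_of.values()) / len(parts_of)

# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

# Vertex partitioning: each vertex is assigned to exactly one of two partitions.
good_vertex_part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}   # follows the graph structure
naive_vertex_part = {v: v % 2 for v in range(6)}          # ignores the graph structure

# Edge partitioning: each edge is assigned to exactly one of two partitions,
# so a vertex may be replicated on several partitions.
good_edge_part = [0, 0, 0, 0, 1, 1, 1]
naive_edge_part = [i % 2 for i in range(len(edges))]

print(edge_cut_ratio(edges, good_vertex_part), edge_cut_ratio(edges, naive_vertex_part))
print(replication_factor(edges, good_edge_part), replication_factor(edges, naive_edge_part))
```

On this toy input the structure-aware assignments cut fewer edges and replicate fewer vertices than the naive ones. Lower values of both metrics generally mean less data has to cross machine boundaries, which is one plausible reason partitioning quality could affect training time and memory footprint. The abstract reports such effects but does not attribute them to these specific metrics.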
