Title: DF-GNN: Dynamic Fusion Framework for Attention Graph Neural Networks on GPUs

Nov 25, 2024

Jiahui Liu, Zhenkun Cai, Zhiyong Chen, Minjie Wang

Abstract: Attention Graph Neural Networks (AT-GNNs), such as GAT and Graph Transformer, have demonstrated superior performance compared to other GNNs. However, existing GNN systems struggle to train AT-GNNs efficiently on GPUs because of their intricate computation patterns. Executing AT-GNN operations without kernel fusion causes heavy data movement and significant kernel launch overhead, while the fixed thread scheduling used by existing GNN kernel fusion strategies leads to sub-optimal performance, redundant computation, and unbalanced workloads. To address these challenges, we propose DF-GNN, a dynamic kernel fusion framework for the AT-GNN family. DF-GNN introduces a dynamic bi-level thread scheduling strategy that allows thread scheduling to be adjusted flexibly while retaining the benefits of shared memory within the fused kernel. DF-GNN tailors thread scheduling to the individual operations in AT-GNNs and accounts for the shift in performance bottleneck caused by super nodes. Additionally, DF-GNN is integrated with the PyTorch framework for high programmability. Evaluations across diverse GNN models and multiple datasets show that DF-GNN surpasses existing GNN kernel optimization work such as cuGraph and dgNN, with speedups of up to $7.0\times$ over the state-of-the-art non-fusion DGL sparse library. Moreover, it achieves an average speedup of $2.16\times$ in end-to-end training compared to the popular GNN computing framework DGL.
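
To make the data-movement argument concrete, the sketch below contrasts an unfused and a fused evaluation of one graph-attention layer in plain Python. It is only an illustration under stated assumptions, not DF-GNN's CUDA kernels or its bi-level thread scheduling: the dot-product scoring (in the style of Graph Transformer), the decomposition into three passes, and all function names (`unfused_attention`, `fused_attention`, `group_by_destination`) are hypothetical and chosen here for clarity. The unfused version materializes a full per-edge array between passes, which on a GPU would correspond to a round trip through global memory and a separate kernel launch; the fused version keeps the intermediates for one destination node in local variables.

```python
import math


def group_by_destination(edges, num_nodes):
    """Build {dst: [src, ...]} from a list of (src, dst) edges."""
    in_neighbors = {node: [] for node in range(num_nodes)}
    for src, dst in edges:
        in_neighbors[dst].append(src)
    return in_neighbors


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unfused_attention(edges, q, k, v):
    """Three separate passes; each one materializes a per-edge list."""
    # Pass 1: one attention score per edge.
    scores = [dot(q[dst], k[src]) for src, dst in edges]

    # Pass 2: softmax over the incoming edges of each destination node.
    max_per_dst = {}
    for (_, dst), s in zip(edges, scores):
        max_per_dst[dst] = max(max_per_dst.get(dst, -math.inf), s)
    exps = [math.exp(s - max_per_dst[dst]) for (_, dst), s in zip(edges, scores)]
    sum_per_dst = {}
    for (_, dst), e in zip(edges, exps):
        sum_per_dst[dst] = sum_per_dst.get(dst, 0.0) + e
    weights = [e / sum_per_dst[dst] for (_, dst), e in zip(edges, exps)]

    # Pass 3: weighted aggregation of source features into each destination.
    out = [[0.0] * len(v[0]) for _ in q]
    for (src, dst), w in zip(edges, weights):
        for j, x in enumerate(v[src]):
            out[dst][j] += w * x
    return out


def fused_attention(in_neighbors, q, k, v):
    """One pass per destination node; intermediates never leave local scope."""
    out = [[0.0] * len(v[0]) for _ in q]
    for dst, srcs in in_neighbors.items():
        if not srcs:
            continue  # nodes without incoming edges keep a zero output
        scores = [dot(q[dst], k[src]) for src in srcs]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        for src, e in zip(srcs, exps):
            for j, x in enumerate(v[src]):
                out[dst][j] += (e / total) * x
    return out
```

Both functions compute the same result, so the fused form changes only where the intermediates live. The fused loop also shows why fixed scheduling can be unbalanced: the work per destination node grows with its in-degree, so a super node with many incoming edges takes far longer than its neighbors.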
