Abstract: Graph Neural Networks (GNNs) show great promise in problems dealing with graph-structured data. One distinguishing feature of GNNs is their flexibility to adapt to many problems, which leads to wide applicability but also makes it hard to find the best model or acceleration technique for a particular problem. One such challenge is that the accuracy or effectiveness of a GNN model or acceleration technique generally depends on the structure of the underlying graph. In this paper, to address the problem of graph-dependent acceleration, we propose ProGNNosis, a data-driven model that predicts the training time of a given GNN model running over a graph of arbitrary characteristics by inspecting the metrics of the input graph. The prediction is made by a regression model trained offline on a diverse synthetic graph dataset. In practice, our method enables informed decisions about which design to use for a specific problem. We define the methodology to build ProGNNosis and apply it to a specific use case, where it helps decide which graph representation is better. Our results show that ProGNNosis helps achieve an average speedup of 1.22x over randomly selecting a graph representation in multiple widely used GNN models such as GCN, GIN, GAT, and GraphSAGE.
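The idea in the abstract above (fit a regression offline on graph metrics, then use the predicted training times to choose a graph representation) can be illustrated with a minimal sketch. This is not the ProGNNosis implementation. The feature set, the least-squares regressor, the representation names `repr_a` and `repr_b`, and the toy cost models that stand in for measured training times are all assumptions made for illustration.

```python
import numpy as np

def graph_metrics(num_nodes, num_edges):
    """Return a small feature vector of graph metrics (illustrative choice)."""
    density = num_edges / (num_nodes * (num_nodes - 1))
    avg_degree = num_edges / num_nodes
    return np.array([1.0, num_nodes, num_edges, density, avg_degree])

def fit_regressor(features, times):
    """Ordinary least-squares fit so that times is approximately features @ weights."""
    weights, *_ = np.linalg.lstsq(features, times, rcond=None)
    return weights

def pick_representation(models, metrics):
    """Return the representation with the lowest predicted training time."""
    predictions = {name: float(metrics @ w) for name, w in models.items()}
    return min(predictions, key=predictions.get), predictions

# Offline phase: placeholder synthetic graphs described only by their sizes.
rng = np.random.default_rng(0)
nodes = rng.integers(100, 10_000, size=200)
edges = nodes * rng.integers(2, 50, size=200)
X = np.stack([graph_metrics(n, e) for n, e in zip(nodes, edges)])

# Toy cost models standing in for measured training times (NOT real data).
toy_times = {
    "repr_a": 1e-6 * edges + 1e-4 * nodes,  # hypothetical representation A
    "repr_b": 3e-6 * edges + 1e-5 * nodes,  # hypothetical representation B
}
models = {name: fit_regressor(X, t) for name, t in toy_times.items()}

# Online phase: inspect the metrics of a new graph and pick a representation.
best, predicted = pick_representation(models, graph_metrics(5_000, 20_000))
print(best, predicted)
```

In a real setting the toy cost models would be replaced by training times measured on the synthetic dataset, and the regressor and metrics could be swapped for whatever fits those measurements best.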
Abstract: In general, to draw robust conclusions from a dataset, the entire analyzed population must be represented in that dataset. A dataset that does not fulfill this condition normally leads to selection bias. Graphs, in turn, have been used to model a wide variety of problems. Although synthetic graphs can be used to augment available real graph datasets to overcome selection bias, generating unbiased synthetic datasets is complex with current tools. In this work, we propose a method to find a synthetic graph dataset that has an even representation of graphs with different metrics. The resulting dataset can then be used, among other purposes, for benchmarking graph processing techniques, for example by comparing the accuracy of different Graph Neural Network (GNN) models or the speedups obtained by different graph processing acceleration frameworks.
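To make "an even representation of graphs with different metrics" concrete, the sketch below shows one simple way such a dataset could be assembled: sample random graphs, bin them by a metric, and reject samples that fall into bins that are already full. The abstract does not describe the authors' method, so this is only an assumed baseline. The Erdős-Rényi generator, the use of density as the single metric, and the helper names are all hypothetical choices.

```python
import random

def random_graph(num_nodes, edge_prob, rng):
    """Sample an undirected Erdős-Rényi G(n, p) graph as a set of edges."""
    return {
        (u, v)
        for u in range(num_nodes)
        for v in range(u + 1, num_nodes)
        if rng.random() < edge_prob
    }

def density(num_nodes, edges):
    """Fraction of possible undirected edges that are present."""
    return len(edges) / (num_nodes * (num_nodes - 1) / 2)

def balanced_dataset(num_bins, per_bin, num_nodes=30, seed=0, max_tries=10_000):
    """Sample graphs until every density bin holds `per_bin` graphs."""
    rng = random.Random(seed)
    bins = [[] for _ in range(num_bins)]
    for _ in range(max_tries):
        if all(len(b) == per_bin for b in bins):
            break
        edges = random_graph(num_nodes, rng.random(), rng)
        index = min(int(density(num_nodes, edges) * num_bins), num_bins - 1)
        if len(bins[index]) < per_bin:  # reject graphs for bins already full
            bins[index].append(edges)
    return bins

dataset = balanced_dataset(num_bins=5, per_bin=4)
print([len(b) for b in dataset])
```

Balancing several metrics at once would need multi-dimensional bins and a generator able to reach every bin, which is where simple rejection sampling like this becomes impractical.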