Abstract:Graph Neural Networks (GNNs) extend the success of neural networks to graph-structured data by accounting for their intrinsic geometry. While extensive research has been done on developing GNN models with superior performance according to a collection of graph representation learning benchmarks, it is currently not well understood what aspects of a given model are probed by them. For example, to what extent do they test the ability of a model to leverage graph structure vs. node features? Here, we develop a principled approach to taxonomize benchmarking datasets according to a $\textit{sensitivity profile}$ that is based on how much GNN performance changes due to a collection of graph perturbations. Our data-driven analysis provides a deeper understanding of which benchmarking data characteristics are leveraged by GNNs. Consequently, our taxonomy can aid in selection and development of adequate graph benchmarks, and better informed evaluation of future GNN methods. Finally, our approach and implementation in $\texttt{GTaxoGym}$ package are extendable to multiple graph prediction task types and future datasets.
Abstract:Graph neural networks (GNNs) have attracted much attention due to their ability to leverage the intrinsic geometries of the underlying data. Although many different types of GNN models have been developed, with many benchmarking procedures to demonstrate the superiority of one GNN model over the others, there is a lack of systematic understanding of the underlying benchmarking datasets, and what aspects of the model are being tested. Here, we provide a principled approach to taxonomize graph benchmarking datasets by carefully designing a collection of graph perturbations to probe the essential data characteristics that GNN models leverage to perform predictions. Our data-driven taxonomization of graph datasets provides a new understanding of critical dataset characteristics that will enable better model evaluation and the development of more specialized GNN models.
Abstract:Graph neural networks (GNNs) have achieved remarkable success as a framework for deep learning on graph-structured data. However, GNNs are fundamentally limited by their tree-structured inductive bias: the WL-subtree kernel formulation bounds the representational capacity of GNNs, and polynomial-time GNNs are provably incapable of recognizing triangles in a graph. In this work, we propose to augment the GNN message-passing operations with information defined on ego graphs (i.e., the induced subgraph surrounding each node). We term these approaches Ego-GNNs and show that Ego-GNNs are provably more powerful than standard message-passing GNNs. In particular, we show that Ego-GNNs are capable of recognizing closed triangles, which is essential given the prominence of transitivity in real-world graphs. We also motivate our approach from the perspective of graph signal processing as a form of multiplex graph convolution. Experimental results on node classification using synthetic and real data highlight the achievable performance gains using this approach.