Abstract:Graph homomorphism counts, first explored by Lov\'asz in 1967, have recently garnered interest as a powerful tool in graph-based machine learning. Grohe (PODS 2020) proposed the theoretical foundations for using homomorphism counts in machine learning on graph level as well as node level tasks. By their very nature, these capture local structural information, which enables the creation of robust structural embeddings. While a first approach for graph level tasks has been made by Nguyen and Maehara (ICML 2020), we experimentally show the effectiveness of homomorphism count based node embeddings. Enriched with node labels, node weights, and edge weights, these offer an interpretable representation of graph data, allowing for enhanced explainability of machine learning models. We propose a theoretical framework for isomorphism-invariant homomorphism count based embeddings which lend themselves to a wide variety of downstream tasks. Our approach capitalises on the efficient computability of graph homomorphism counts for bounded treewidth graph classes, rendering it a practical solution for real-world applications. We demonstrate their expressivity through experiments on benchmark datasets. Although our results do not match the accuracy of state-of-the-art neural architectures, they are comparable to other advanced graph learning models. Remarkably, our approach demarcates itself by ensuring explainability for each individual feature. By integrating interpretable machine learning algorithms like SVMs or Random Forests, we establish a seamless, end-to-end explainable pipeline. Our study contributes to the advancement of graph-based techniques that offer both performance and interpretability.
Abstract:In this paper we propose a graph neural network architecture solving the AC power flow problem under realistic constraints. While the energy transition is changing the energy industry to a digitalized and decentralized energy system, the challenges are increasingly shifting to the distribution grid level to integrate new loads and generation technologies. To ensure a save and resilient operation of distribution grids, AC power flow calculations are the means of choice to determine grid operating limits or analyze grid asset utilization in planning procedures. In our approach we demonstrate the development of a framework which makes use of graph neural networks to learn the physical constraints of the power flow. We present our model architecture on which we perform unsupervised training to learn a general solution of the AC power flow formulation that is independent of the specific topologies and supply tasks used for training. Finally, we demonstrate, validate and discuss our results on medium voltage benchmark grids.
Abstract:We propose CRaWl (CNNs for Random Walks), a novel neural network architecture for graph learning. It is based on processing sequences of small subgraphs induced by random walks with standard 1D CNNs. Thus, CRaWl is fundamentally different from typical message passing graph neural network architectures. It is inspired by techniques counting small subgraphs, such as the graphlet kernel and motif counting, and combines them with random walk based techniques in a highly efficient and scalable neural architecture. We demonstrate empirically that CRaWl matches or outperforms state-of-the-art GNN architectures across a multitude of benchmark datasets for graph learning.
Abstract:We systematically evaluate the (in-)stability of state-of-the-art node embedding algorithms due to randomness, i.e., the random variation of their outcomes given identical algorithms and graphs. We apply five node embeddings algorithms---HOPE, LINE, node2vec, SDNE, and GraphSAGE---to synthetic and empirical graphs and assess their stability under randomness with respect to (i) the geometry of embedding spaces as well as (ii) their performance in downstream tasks. We find significant instabilities in the geometry of embedding spaces independent of the centrality of a node. In the evaluation of downstream tasks, we find that the accuracy of node classification seems to be unaffected by random seeding while the actual classification of nodes can vary significantly. This suggests that instability effects need to be taken into account when working with node embeddings. Our work is relevant for researchers and engineers interested in the effectiveness, reliability, and reproducibility of node embedding approaches.
Abstract:Constraint satisfaction problems form an important and wide class of combinatorial search and optimization problems with many applications in AI and other areas. We introduce a recurrent neural network architecture RUN-CSP (Recurrent Unsupervised Neural Network for Constraint Satisfaction Problems) to train message passing networks solving binary constraint satisfaction problems (CSPs) or their optimization versions (Max-CSP). The architecture is universal in the sense that it works for all binary CSPs: depending on the constraint language, we can automtically design a loss function, which is then used to train generic neural nets. In this paper, we experimentally evaluate our approach for the 3-colorability problem (3-Col) and its optimization version (Max-3-Col) and for the maximum 2-satisfiability problem (Max-2-Sat). We also extend the framework to work for related optimization problems such as the maximum independent set problem (Max-IS). Training is unsupervised, we train the network on arbitrary (unlabeled) instances of the problems. Moreover, we experimentally show that it suffices to train on relatively small instances; the resulting message passing network will perform well on much larger instances (at least 10-times larger).