Abstract:In this paper, we study the generalization capabilities of geometric graph neural networks (GNNs). We consider GNNs over a geometric graph constructed from a finite set of randomly sampled points over an embedded manifold with topological information captured. We prove a generalization gap between the optimal empirical risk and the optimal statistical risk of this GNN, which decreases with the number of sampled points from the manifold and increases with the dimension of the underlying manifold. This generalization gap ensures that the GNN trained on a graph on a set of sampled points can be utilized to process other unseen graphs constructed from the same underlying manifold. The most important observation is that the generalization capability can be realized with one large graph instead of being limited to the size of the graph as in previous results. The generalization gap is derived based on the non-asymptotic convergence result of a GNN on the sampled graph to the underlying manifold neural networks (MNNs). We verify this theoretical result with experiments on both Arxiv dataset and Cora dataset.
Abstract:Graph neural networks (GNNs) have demonstrated their effectiveness in various tasks supported by their generalization capabilities. However, the current analysis of GNN generalization relies on the assumption that training and testing data are independent and identically distributed (i.i.d). This imposes limitations on the cases where a model mismatch exists when generating testing data. In this paper, we examine GNNs that operate on geometric graphs generated from manifold models, explicitly focusing on scenarios where there is a mismatch between manifold models generating training and testing data. Our analysis reveals the robustness of the GNN generalization in the presence of such model mismatch. This indicates that GNNs trained on graphs generated from a manifold can still generalize well to unseen nodes and graphs generated from a mismatched manifold. We attribute this mismatch to both node feature perturbations and edge perturbations within the generated graph. Our findings indicate that the generalization gap decreases as the number of nodes grows in the training graph while increasing with larger manifold dimension as well as larger mismatch. Importantly, we observe a trade-off between the generalization of GNNs and the capability to discriminate high-frequency components when facing a model mismatch. The most important practical consequence of this analysis is to shed light on the filter design of generalizable GNNs robust to model mismatch. We verify our theoretical findings with experiments on multiple real-world datasets.
Abstract:Convolutional neural networks have been successfully extended to operate on graphs, giving rise to Graph Neural Networks (GNNs). GNNs combine information from adjacent nodes by successive applications of graph convolutions. GNNs have been implemented successfully in various learning tasks while the theoretical understanding of their generalization capability is still in progress. In this paper, we leverage manifold theory to analyze the statistical generalization gap of GNNs operating on graphs constructed on sampled points from manifolds. We study the generalization gaps of GNNs on both node-level and graph-level tasks. We show that the generalization gaps decrease with the number of nodes in the training graphs, which guarantees the generalization of GNNs to unseen points over manifolds. We validate our theoretical results in multiple real-world datasets.
Abstract:This paper studies the relationship between a graph neural network (GNN) and a manifold neural network (MNN) when the graph is constructed from a set of points sampled from the manifold, thus encoding geometric information. We consider convolutional MNNs and GNNs where the manifold and the graph convolutions are respectively defined in terms of the Laplace-Beltrami operator and the graph Laplacian. Using the appropriate kernels, we analyze both dense and moderately sparse graphs. We prove non-asymptotic error bounds showing that convolutional filters and neural networks on these graphs converge to convolutional filters and neural networks on the continuous manifold. As a byproduct of this analysis, we observe an important trade-off between the discriminability of graph filters and their ability to approximate the desired behavior of manifold filters. We then discuss how this trade-off is ameliorated in neural networks due to the frequency mixing property of nonlinearities. We further derive a transferability corollary for geometric graphs sampled from the same manifold. We validate our results numerically on a navigation control problem and a point cloud classification task.
Abstract:In this work we introduce a convolution operation over the tangent bundle of Riemann manifolds in terms of exponentials of the Connection Laplacian operator. We define tangent bundle filters and tangent bundle neural networks (TNNs) based on this convolution operation, which are novel continuous architectures operating on tangent bundle signals, i.e. vector fields over the manifolds. Tangent bundle filters admit a spectral representation that generalizes the ones of scalar manifold filters, graph filters and standard convolutional filters in continuous time. We then introduce a discretization procedure, both in the space and time domains, to make TNNs implementable, showing that their discrete counterpart is a novel principled variant of the very recently introduced sheaf neural networks. We formally prove that this discretized architecture converges to the underlying continuous TNN. Finally, we numerically evaluate the effectiveness of the proposed architecture on various learning tasks, both on synthetic and real data.
Abstract:In this paper we propose a pooling approach for convolutional information processing on graphs relying on the theory of graphons and limits of dense graph sequences. We present three methods that exploit the induced graphon representation of graphs and graph signals on partitions of [0, 1]2 in the graphon space. As a result we derive low dimensional representations of the convolutional operators, while a dimensionality reduction of the signals is achieved by simple local interpolation of functions in L2([0, 1]). We prove that those low dimensional representations constitute a convergent sequence of graphs and graph signals, respectively. The methods proposed and the theoretical guarantees that we provide show that the reduced graphs and signals inherit spectral-structural properties of the original quantities. We evaluate our approach with a set of numerical experiments performed on graph neural networks (GNNs) that rely on graphon pooling. We observe that graphon pooling performs significantly better than other approaches proposed in the literature when dimensionality reduction ratios between layers are large. We also observe that when graphon pooling is used we have, in general, less overfitting and lower computational cost.
Abstract:The increasing availability of geometric data has motivated the need for information processing over non-Euclidean domains modeled as manifolds. The building block for information processing architectures with desirable theoretical properties such as invariance and stability is convolutional filtering. Manifold convolutional filters are defined from the manifold diffusion sequence, constructed by successive applications of the Laplace-Beltrami operator to manifold signals. However, the continuous manifold model can only be accessed by sampling discrete points and building an approximate graph model from the sampled manifold. Effective linear information processing on the manifold requires quantifying the error incurred when approximating manifold convolutions with graph convolutions. In this paper, we derive a non-asymptotic error bound for this approximation, showing that convolutional filtering on the sampled manifold converges to continuous manifold filtering. Our findings are further demonstrated empirically on a problem of navigation control.
Abstract:In this work we introduce a convolution operation over the tangent bundle of Riemannian manifolds exploiting the Connection Laplacian operator. We use the convolution to define tangent bundle filters and tangent bundle neural networks (TNNs), novel continuous architectures operating on tangent bundle signals, i.e. vector fields over manifolds. We discretize TNNs both in space and time domains, showing that their discrete counterpart is a principled variant of the recently introduced Sheaf Neural Networks. We formally prove that this discrete architecture converges to the underlying continuous TNN. We numerically evaluate the effectiveness of the proposed architecture on a denoising task of a tangent vector field over the unit 2-sphere.
Abstract:Geometric deep learning has gained much attention in recent years due to more available data acquired from non-Euclidean domains. Some examples include point clouds for 3D models and wireless sensor networks in communications. Graphs are common models to connect these discrete data points and capture the underlying geometric structure. With the large amount of these geometric data, graphs with arbitrarily large size tend to converge to a limit model -- the manifold. Deep neural network architectures have been proved as a powerful technique to solve problems based on these data residing on the manifold. In this paper, we propose a manifold neural network (MNN) composed of a bank of manifold convolutional filters and point-wise nonlinearities. We define a manifold convolution operation which is consistent with the discrete graph convolution by discretizing in both space and time domains. To sum up, we focus on the manifold model as the limit of large graphs and construct MNNs, while we can still bring back graph neural networks by the discretization of MNNs. We carry out experiments based on point-cloud dataset to showcase the performance of our proposed MNNs.
Abstract:In this paper we study the stability properties of aggregation graph neural networks (Agg-GNNs) considering perturbations of the underlying graph. An Agg-GNN is a hybrid architecture where information is defined on the nodes of a graph, but it is processed block-wise by Euclidean CNNs on the nodes after several diffusions on the graph shift operator. We derive stability bounds for the mapping operator associated to a generic Agg-GNN, and we specify conditions under which such operators can be stable to deformations. We prove that the stability bounds are defined by the properties of the filters in the first layer of the CNN that acts on each node. Additionally, we show that there is a close relationship between the number of aggregations, the filter's selectivity, and the size of the stability constants. We also conclude that in Agg-GNNs the selectivity of the mapping operators is tied to the properties of the filters only in the first layer of the CNN stage. This shows a substantial difference with respect to the stability properties of selection GNNs, where the selectivity of the filters in all layers is constrained by their stability. We provide numerical evidence corroborating the results derived, testing the behavior of Agg-GNNs in real life application scenarios considering perturbations of different magnitude.