Abstract:Graph Neural Networks (GNNs) have emerged as a promising tool to handle data exhibiting an irregular structure. However, most GNN architectures perform well on homophilic datasets, where the labels of neighboring nodes are likely to be the same. In recent years, an increasing body of work has been devoted to the development of GNN architectures for heterophilic datasets, where labels do not exhibit this low-pass behavior. In this work, we create a new graph in which nodes are connected if they share structural characteristics, meaning a higher chance of sharing their labels, and then use this new graph in the GNN architecture. To do this, we compute the k-nearest neighbors graph according to distances between structural features, which are either (i) role-based, such as degree, or (ii) global, such as centrality measures. Experiments show that the labels are smoother in this newly defined graph and that the performance of GNN architectures improves when using this alternative structure.
Abstract:Graph neural networks (GNNs) have become a workhorse approach for learning from data defined over irregular domains, typically by implicitly assuming that the data structure is represented by a homophilic graph. However, recent works have revealed that many relevant applications involve heterophilic data where the performance of GNNs can be notably compromised. To address this challenge, we present a simple yet effective architecture designed to mitigate the limitations of the homophily assumption. The proposed architecture reinterprets the role of graph filters in convolutional GNNs, resulting in a more general architecture while incorporating a stronger inductive bias than GNNs based on filter banks. The proposed convolutional layer enhances the expressive capacity of the architecture enabling it to learn from both homophilic and heterophilic data and preventing the issue of oversmoothing. From a theoretical standpoint, we show that the proposed architecture is permutation equivariant. Finally, we show that the proposed GNNs compares favorably relative to several state-of-the-art baselines in both homophilic and heterophilic datasets, showcasing its promising potential.
Abstract:Graph learning is the fundamental task of estimating unknown graph connectivity from available data. Typical approaches assume that not only is all information available simultaneously but also that all nodes can be observed. However, in many real-world scenarios, data can neither be known completely nor obtained all at once. We present a novel method for online graph estimation that accounts for the presence of hidden nodes. We consider signals that are stationary on the underlying graph, which provides a model for the unknown connections to hidden nodes. We then formulate a convex optimization problem for graph learning from streaming, incomplete graph signals. We solve the proposed problem through an efficient proximal gradient algorithm that can run in real-time as data arrives sequentially. Additionally, we provide theoretical conditions under which our online algorithm is similar to batch-wise solutions. Through experimental results on synthetic and real-world data, we demonstrate the viability of our approach for online graph learning in the presence of missing observations.
Abstract:This paper addresses the problem of online network topology inference for expanding graphs from a stream of spatiotemporal signals. Online algorithms for dynamic graph learning are crucial in delay-sensitive applications or when changes in topology occur rapidly. While existing works focus on inferring the connectivity within a fixed set of nodes, in practice, the graph can grow as new nodes join the network. This poses additional challenges like modeling temporal dynamics involving signals and graphs of different sizes. This growth also increases the computational complexity of the learning process, which may become prohibitive. To the best of our knowledge, this is the first work to tackle this setting. We propose a general online algorithm based on projected proximal gradient descent that accounts for the increasing graph size at each iteration. Recursively updating the sample covariance matrix is a key aspect of our approach. We introduce a strategy that enables different types of updates for nodes that just joined the network and for previously existing nodes. To provide further insights into the proposed method, we specialize it in Gaussian Markov random field settings, where we analyze the computational complexity and characterize the dynamic cumulative regret. Finally, we demonstrate the effectiveness of the proposed approach using both controlled experiments and real-world datasets from epidemic and financial networks.
Abstract:We address the problem of learning the topology of directed acyclic graphs (DAGs) from nodal observations, which adhere to a linear structural equation model. Recent advances framed the combinatorial DAG structure learning task as a continuous optimization problem, yet existing methods must contend with the complexities of non-convex optimization. To overcome this limitation, we assume that the latent DAG contains only non-negative edge weights. Leveraging this additional structure, we argue that cycles can be effectively characterized (and prevented) using a convex acyclicity function based on the log-determinant of the adjacency matrix. This convexity allows us to relax the task of learning the non-negative weighted DAG as an abstract convex optimization problem. We propose a DAG recovery algorithm based on the method of multipliers, that is guaranteed to return a global minimizer. Furthermore, we prove that in the infinite sample size regime, the convexity of our approach ensures the recovery of the true DAG structure. We empirically validate the performance of our algorithm in several reproducible synthetic-data test cases, showing that it outperforms state-of-the-art alternatives.
Abstract:We propose estimating Gaussian graphical models (GGMs) that are fair with respect to sensitive nodal attributes. Many real-world models exhibit unfair discriminatory behavior due to biases in data. Such discrimination is known to be exacerbated when data is equipped with pairwise relationships encoded in a graph. Additionally, the effect of biased data on graphical models is largely underexplored. We thus introduce fairness for graphical models in the form of two bias metrics to promote balance in statistical similarities across nodal groups with different sensitive attributes. Leveraging these metrics, we present Fair GLASSO, a regularized graphical lasso approach to obtain sparse Gaussian precision matrices with unbiased statistical dependencies across groups. We also propose an efficient proximal gradient algorithm to obtain the estimates. Theoretically, we express the tradeoff between fair and accurate estimated precision matrices. Critically, this includes demonstrating when accuracy can be preserved in the presence of a fairness regularizer. On top of this, we study the complexity of Fair GLASSO and demonstrate that our algorithm enjoys a fast convergence rate. Our empirical validation includes synthetic and real-world simulations that illustrate the value and effectiveness of our proposed optimization problem and iterative algorithm.
Abstract:We develop a novel convolutional architecture tailored for learning from data defined over directed acyclic graphs (DAGs). DAGs can be used to model causal relationships among variables, but their nilpotent adjacency matrices pose unique challenges towards developing DAG signal processing and machine learning tools. To address this limitation, we harness recent advances offering alternative definitions of causal shifts and convolutions for signals on DAGs. We develop a novel convolutional graph neural network that integrates learnable DAG filters to account for the partial ordering induced by the graph topology, thus providing valuable inductive bias to learn effective representations of DAG-supported data. We discuss the salient advantages and potential limitations of the proposed DAG convolutional network (DCN) and evaluate its performance on two learning tasks using synthetic data: network diffusion estimation and source identification. DCN compares favorably relative to several baselines, showcasing its promising potential.
Abstract:We consider fair network topology inference from nodal observations. Real-world networks often exhibit biased connections based on sensitive nodal attributes. Hence, different subpopulations of nodes may not share or receive information equitably. We thus propose an optimization-based approach to accurately infer networks while discouraging biased edges. To this end, we present bias metrics that measure topological demographic parity to be applied as convex penalties, suitable for most optimization-based graph learning methods. Moreover, we encourage equitable treatment for any number of subpopulations of differing sizes. We validate our method on synthetic and real-world simulations using networks with both biased and unbiased connections.
Abstract:Graph Neural Networks (GNNs) have emerged as a notorious alternative to address learning problems dealing with non-Euclidean datasets. However, although most works assume that the graph is perfectly known, the observed topology is prone to errors stemming from observational noise, graph-learning limitations, or adversarial attacks. If ignored, these perturbations may drastically hinder the performance of GNNs. To address this limitation, this work proposes a robust implementation of GNNs that explicitly accounts for the presence of perturbations in the observed topology. For any task involving GNNs, our core idea is to i) solve an optimization problem not only over the learnable parameters of the GNN but also over the true graph, and ii) augment the fitting cost with a term accounting for discrepancies on the graph. Specifically, we consider a convolutional GNN based on graph filters and follow an alternating optimization approach to handle the (non-differentiable and constrained) optimization problem by combining gradient descent and projected proximal updates. The resulting algorithm is not limited to a particular type of graph and is amenable to incorporating prior information about the perturbations. Finally, we assess the performance of the proposed method through several numerical experiments.
Abstract:Blind deconvolution over graphs involves using (observed) output graph signals to obtain both the inputs (sources) as well as the filter that drives (models) the graph diffusion process. This is an ill-posed problem that requires additional assumptions, such as the sources being sparse, to be solvable. This paper addresses the blind deconvolution problem in the presence of imperfect graph information, where the observed graph is a perturbed version of the (unknown) true graph. While not having perfect knowledge of the graph is arguably more the norm than the exception, the body of literature on this topic is relatively small. This is partly due to the fact that translating the uncertainty about the graph topology to standard graph signal processing tools (e.g. eigenvectors or polynomials of the graph) is a challenging endeavor. To address this limitation, we propose an optimization-based estimator that solves the blind identification in the vertex domain, aims at estimating the inverse of the generating filter, and accounts explicitly for additive graph perturbations. Preliminary numerical experiments showcase the effectiveness and potential of the proposed algorithm.