
Soumyasundar Pal

Simplifying Graph Transformers

Apr 17, 2025

Liheng Ma, Soumyasundar Pal, Yingxue Zhang, Philip H. S. Torr, Mark Coates

Abstract: Transformers have attained outstanding performance across various modalities, employing scaled-dot-product (SDP) attention mechanisms. Researchers have attempted to migrate Transformers to graph learning, but most advanced Graph Transformers are designed with major architectural differences, either integrating message-passing or incorporating sophisticated attention mechanisms. These complexities prevent the easy adoption of Transformer training advances. We propose three simple modifications to the plain Transformer to render it applicable to graphs without introducing major architectural distortions. Specifically, we advocate for the use of (1) simplified $L_2$ attention to measure the magnitude closeness of tokens; (2) adaptive root-mean-square normalization to preserve token magnitude information; and (3) a relative positional encoding bias with a shared encoder. Significant performance gains across a variety of graph datasets justify the effectiveness of our proposed modifications. Furthermore, empirical evaluation on the expressiveness benchmark reveals noteworthy realized expressiveness in graph isomorphism testing.


Hint Marginalization for Improved Reasoning in Large Language Models

Dec 17, 2024

Soumyasundar Pal, Didier Chételat, Yingxue Zhang, Mark Coates

Abstract: Large Language Models (LLMs) have exhibited an impressive capability to perform reasoning tasks, especially if they are encouraged to generate a sequence of intermediate steps. Reasoning performance can be improved by suitably combining multiple LLM responses, generated either in parallel in a single query, or via sequential interactions with LLMs throughout the reasoning process. Existing strategies for combination, such as self-consistency and progressive-hint-prompting, make inefficient use of the LLM responses. We present Hint Marginalization, a novel and principled algorithmic framework to enhance the reasoning capabilities of LLMs. Our approach can be viewed as an iterative sampling strategy for forming a Monte Carlo approximation of an underlying distribution of answers, with the goal of identifying the mode, that is, the most likely answer. Empirical evaluation on several benchmark datasets for arithmetic reasoning demonstrates the superiority of the proposed approach.
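The idea of approximating a distribution over answers and picking its mode can be illustrated with a minimal sketch. This is not the Hint Marginalization algorithm itself; it is the simpler majority-vote style estimate (in the spirit of self-consistency), and the function name `empirical_mode` and the sample answers are hypothetical.

```python
from collections import Counter


def empirical_mode(answers):
    """Return the most frequent answer and the empirical answer distribution.

    `answers` is a list of final answers extracted from independently
    sampled LLM responses (an assumption made for this illustration).
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    total = len(answers)
    # Monte Carlo estimate of the probability of each distinct answer.
    distribution = {answer: count / total for answer, count in counts.items()}
    # The mode is the answer with the highest estimated probability.
    mode = max(distribution, key=distribution.get)
    return mode, distribution


# Hypothetical answers from five sampled responses to one arithmetic question.
mode, distribution = empirical_mode(["42", "41", "42", "42", "40"])
```

With more samples the empirical distribution becomes a better approximation, which is why the efficiency with which each response is used matters.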


CKGConv: General Graph Convolution with Continuous Kernels

Apr 21, 2024

Liheng Ma, Soumyasundar Pal, Yitian Zhang, Jiaming Zhou, Yingxue Zhang, Mark Coates

Abstract: The existing definitions of graph convolution, either from spatial or spectral perspectives, are inflexible and not unified. Defining a general convolution operator in the graph domain is challenging due to the lack of canonical coordinates, the presence of irregular structures, and the properties of graph symmetries. In this work, we propose a novel graph convolution framework by parameterizing the kernels as continuous functions of pseudo-coordinates derived via graph positional encoding. We name this Continuous Kernel Graph Convolution (CKGConv). Theoretically, we demonstrate that CKGConv is flexible and expressive. CKGConv encompasses many existing graph convolutions, and exhibits the same expressiveness as graph transformers in terms of distinguishing non-isomorphic graphs. Empirically, we show that CKGConv-based networks outperform existing graph convolutional networks and perform comparably to the best graph transformers across a variety of graph datasets.


Multi-resolution Time-Series Transformer for Long-term Forecasting

Nov 07, 2023

Yitian Zhang, Liheng Ma, Soumyasundar Pal, Yingxue Zhang, Mark Coates

Abstract: The performance of transformers for time-series forecasting has improved significantly. Recent architectures learn complex temporal patterns by segmenting a time-series into patches and using the patches as tokens. The patch size controls the ability of transformers to learn the temporal patterns at different frequencies: shorter patches are effective for learning localized, high-frequency patterns, whereas mining long-term seasonalities and trends requires longer patches. Inspired by this observation, we propose a novel framework, Multi-resolution Time-Series Transformer (MTST), which consists of a multi-branch architecture for simultaneous modeling of diverse temporal patterns at different resolutions. In contrast to many existing time-series transformers, we employ relative positional encoding, which is better suited for extracting periodic components at different scales. Extensive experiments on several real-world datasets demonstrate the effectiveness of MTST in comparison to state-of-the-art forecasting techniques.
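The patching step described in the abstract can be sketched as follows. This is only an illustration of segmenting one series at two resolutions, not the MTST implementation; the helper `make_patches` and the patch lengths and strides are assumptions chosen for the example.

```python
def make_patches(series, patch_len, stride):
    """Split a 1-D sequence into patches of length `patch_len`.

    Patches start every `stride` steps; a trailing remainder that is
    shorter than `patch_len` is dropped.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


series = list(range(12))

# Short patches: many tokens, each covering a small, local window.
short_patches = make_patches(series, patch_len=2, stride=2)

# Long patches: few tokens, each covering a wide window of the series.
long_patches = make_patches(series, patch_len=6, stride=6)
```

In a multi-branch design, each branch would receive the tokens produced at one patch length, so the same series is seen at several resolutions at once.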


Node Copying: A Random Graph Model for Effective Graph Sampling

Aug 04, 2022

Florence Regol, Soumyasundar Pal, Jianing Sun, Yingxue Zhang, Yanhui Geng, Mark Coates


Abstract: There has been an increased interest in applying machine learning techniques to relational structured data based on an observed graph. Often, this graph is not fully representative of the true relationship amongst nodes. In these settings, building a generative model conditioned on the observed graph allows the graph uncertainty to be taken into account. Various existing techniques either rely on restrictive assumptions, fail to preserve topological properties within the samples, or are prohibitively expensive for larger graphs. In this work, we introduce the node copying model for constructing a distribution over graphs. Sampling of a random graph is carried out by replacing each node's neighbors by those of a randomly sampled similar node. The sampled graphs preserve key characteristics of the graph structure without explicitly targeting them. Additionally, sampling from this model is extremely simple and scales linearly with the number of nodes. We show the usefulness of the copying model in three tasks. First, in node classification, a Bayesian formulation based on node copying achieves higher accuracy in sparse data settings. Second, we employ our proposed model to mitigate the effect of adversarial attacks on the graph topology. Last, incorporation of the model in a recommendation system setting improves recall over state-of-the-art methods.

* Signal Processing, Volume 192, March 2022, 108335
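The sampling step stated in the abstract, replacing each node's neighbors by those of a randomly sampled similar node, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: "similar" is taken here to mean "in the same group" (for example, the same predicted label), and the names `node_copy_sample`, `neighbors`, and `groups` are hypothetical.

```python
import random


def node_copy_sample(neighbors, groups, rng):
    """Draw one graph sample by copying neighbor sets between similar nodes.

    neighbors: dict mapping each node to the set of its neighbors.
    groups:    dict mapping each node to a group id; nodes in the same
               group are treated as "similar" (an assumption of this sketch).
    rng:       a random.Random instance, passed in for reproducibility.
    """
    # Index the nodes by group so a donor can be drawn in constant time.
    members = {}
    for node, group in groups.items():
        members.setdefault(group, []).append(node)

    sampled = {}
    for node in neighbors:
        # One donor draw per node; the donor may be the node itself.
        donor = rng.choice(members[groups[node]])
        sampled[node] = set(neighbors[donor])
    return sampled


neighbors = {0: {1, 2}, 1: {0}, 2: {0, 3}, 3: {2}}
groups = {0: "a", 1: "a", 2: "b", 3: "b"}
sampled = node_copy_sample(neighbors, groups, random.Random(0))
```

Each node requires a single donor draw, so the number of draws grows linearly with the number of nodes. Because every neighbor set is replaced independently, the sampled adjacency in this sketch is in general directed.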


Bag Graph: Multiple Instance Learning using Bayesian Graph Neural Networks

Feb 22, 2022

Soumyasundar Pal, Antonios Valkanas, Florence Regol, Mark Coates


Abstract: Multiple Instance Learning (MIL) is a weakly supervised learning problem where the aim is to assign labels to sets or bags of instances, as opposed to traditional supervised learning where each instance is assumed to be independent and identically distributed (IID) and is to be labeled individually. Recent work has shown promising results for neural network models in the MIL setting. Instead of focusing on each instance, these models are trained in an end-to-end fashion to learn effective bag-level representations by suitably combining permutation invariant pooling techniques with neural architectures. In this paper, we consider modelling the interactions between bags using a graph and employ Graph Neural Networks (GNNs) to facilitate end-to-end learning. Since a meaningful graph representing dependencies between bags is rarely available, we propose to use a Bayesian GNN framework that can generate a likely graph structure for scenarios where there is uncertainty in the graph or when no graph is available. Empirical results demonstrate the efficacy of the proposed technique for several MIL benchmark tasks and a distribution regression task.


RNN with Particle Flow for Probabilistic Spatio-temporal Forecasting

Jun 10, 2021

Soumyasundar Pal, Liheng Ma, Yingxue Zhang, Mark Coates


Abstract: Spatio-temporal forecasting has numerous applications in analyzing wireless, traffic, and financial networks. Many classical statistical models often fall short in handling the complexity and high non-linearity present in time-series data. Recent advances in deep learning allow for better modelling of spatial and temporal dependencies. While most of these models focus on obtaining accurate point forecasts, they do not characterize the prediction uncertainty. In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. We use particle flow as the tool for approximating the posterior distribution of the states, as it is shown to be highly effective in complex, high-dimensional settings. Thorough experimentation on several real-world time-series datasets demonstrates that our approach provides better characterization of uncertainty while maintaining comparable accuracy to the state-of-the-art point forecasting methods.

* ICML 2021


Node Copying for Protection Against Graph Neural Network Topology Attacks

Jul 09, 2020

Florence Regol, Soumyasundar Pal, Mark Coates


Abstract: Adversarial attacks can affect the performance of existing deep learning models. With the increased interest in graph-based machine learning techniques, there have been investigations which suggest that these models are also vulnerable to attacks. In particular, corruptions of the graph topology can severely degrade the performance of graph-based learning algorithms. This is because the prediction capability of these algorithms relies mostly on the similarity structure imposed by the graph connectivity. Therefore, detecting the location of the corruption and correcting the induced errors becomes crucial. There has been some recent work which tackles the detection problem; however, these methods do not address the effect of the attack on the downstream learning task. In this work, we propose an algorithm that uses node copying to mitigate the degradation in classification that is caused by adversarial attacks. The proposed methodology is applied only after the model for the downstream task is trained, and the added computation cost scales well for large graphs. Experimental results show the effectiveness of our approach for several real-world datasets.


Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation

Jul 09, 2020

Florence Regol, Soumyasundar Pal, Yingxue Zhang, Mark Coates


Abstract: Node classification in attributed graphs is an important task in multiple practical settings, but it can often be difficult or expensive to obtain labels. Active learning can improve the achieved classification performance for a given budget on the number of queried labels. The best existing methods are based on graph neural networks, but they often perform poorly unless a sizeable validation set of labelled nodes is available in order to choose good hyperparameters. We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs; our algorithm uses graph cognizant logistic regression, equivalent to a linearized graph convolutional neural network (GCN), for the prediction phase and maximizes the expected error reduction in the query phase. To reduce the delay experienced by a labeller interacting with the system, we derive a preemptive querying system that calculates a new query during the labelling process. To address the setting where learning starts with almost no labelled data, we also develop a hybrid algorithm that performs adaptive model averaging of label propagation and linearized GCN inference. We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches, and illustrate the practical value of the method by applying it to a private microwave link network dataset.


Non-Parametric Graph Learning for Bayesian Graph Neural Networks

Jun 23, 2020

Soumyasundar Pal, Saber Malekmohammadi, Florence Regol, Yingxue Zhang, Yishi Xu, Mark Coates


Abstract: Graphs are ubiquitous in modelling relational structures. Recent endeavours in machine learning for graph-structured data have led to many architectures and learning algorithms. However, the graph used by these algorithms is often constructed based on inaccurate modelling assumptions and/or noisy data. As a result, it fails to represent the true relationships between nodes. A Bayesian framework which targets posterior inference of the graph by considering it as a random quantity can be beneficial. In this paper, we propose a novel non-parametric graph model for constructing the posterior distribution of graph adjacency matrices. The proposed model is flexible in the sense that it can effectively take into account the output of graph-based learning algorithms that target specific tasks. In addition, model inference scales well to large graphs. We demonstrate the advantages of this model in three different problem settings: node classification, link prediction and recommendation.
