Abstract: We introduce CSA (Chromatic Self-Attention), a novel self-attention mechanism that extends the notion of attention scores to attention _filters_, independently modulating the feature channels. We showcase CSA in a fully-attentional graph Transformer, CGT (Chromatic Graph Transformer), which integrates both graph structural information and edge features, completely bypassing the need for local message-passing components. Our method flexibly encodes graph structure through node-node interactions by enriching the original edge features with a relative positional encoding scheme. We propose a new scheme based on random walks that encodes both structural and positional information, and we show how to incorporate higher-order topological information, such as rings in molecular graphs. Our approach achieves state-of-the-art results on the ZINC benchmark dataset while providing a flexible framework for encoding graph structure and incorporating higher-order topology.
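To make the core idea concrete, below is a minimal sketch of channel-wise "attention filters": instead of a single scalar score per node pair, each pair also receives a per-channel gate derived from enriched edge / relative positional features. This is only an illustration under assumed conventions; the module name `ChromaticSelfAttentionSketch`, the sigmoid gating, and the tensor shapes are our own assumptions, not the authors' reference implementation.

```python
# Sketch of chromatic self-attention: scalar attention scores are combined
# with per-channel filters computed from pairwise edge features.
# All names and design details here are illustrative assumptions.
import torch
import torch.nn as nn


class ChromaticSelfAttentionSketch(nn.Module):
    def __init__(self, dim: int, edge_dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # Maps enriched edge features to one gate per feature channel.
        self.edge_to_filter = nn.Linear(edge_dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor, e: torch.Tensor) -> torch.Tensor:
        # x: (n, dim) node features; e: (n, n, edge_dim) pairwise edge /
        # relative positional encodings (e.g. random-walk based).
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = (q @ k.t()) * self.scale              # (n, n) scalar scores
        alpha = scores.softmax(dim=-1)                 # standard attention weights
        filt = torch.sigmoid(self.edge_to_filter(e))   # (n, n, dim) channel filters
        # Each channel of v_j is modulated independently before aggregation.
        return torch.einsum("ij,ijd,jd->id", alpha, filt, v)


if __name__ == "__main__":
    n, dim, edge_dim = 5, 16, 8
    x = torch.randn(n, dim)
    e = torch.randn(n, n, edge_dim)
    out = ChromaticSelfAttentionSketch(dim, edge_dim)(x, e)
    print(out.shape)  # torch.Size([5, 16])
```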
Abstract: We show that viewing graphs as sets of node features and incorporating structural and positional information into a transformer architecture yields representations that outperform those learned with classical graph neural networks (GNNs). Our model, GraphiT, encodes such information by (i) leveraging relative positional encoding strategies in self-attention scores based on positive definite kernels on graphs, and (ii) enumerating and encoding local sub-structures such as paths of short length. We thoroughly evaluate these two ideas on many classification and regression tasks, demonstrating the effectiveness of each of them independently as well as in combination. In addition to performing well on standard benchmarks, our model also admits natural visualization mechanisms for interpreting the graph motifs that explain its predictions, making it a potentially strong candidate for scientific applications where interpretation is important. Code is available at https://github.com/inria-thoth/GraphiT.
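As a rough illustration of idea (i), the sketch below reweights standard dot-product attention with a positive definite kernel on the graph (here a heat/diffusion kernel derived from the graph Laplacian). The function names, the choice of kernel parameter, and the row renormalization are assumptions made for this example; the authors' actual implementation is in the repository linked above.

```python
# Sketch of kernel-based relative positional encoding in self-attention:
# attention weights are modulated by a positive definite kernel on the graph.
# Details (kernel choice, normalization) are illustrative assumptions.
import torch


def diffusion_kernel(adj: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # K = exp(-beta * L), with L the unnormalized graph Laplacian.
    deg = torch.diag(adj.sum(dim=-1))
    lap = deg - adj
    return torch.matrix_exp(-beta * lap)


def kernel_modulated_attention(x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
    # x: (n, d) node features; adj: (n, n) adjacency matrix.
    d = x.shape[-1]
    scores = (x @ x.t()) / d ** 0.5                          # plain dot-product scores
    kernel = diffusion_kernel(adj)                           # (n, n) relative positional term
    weights = torch.softmax(scores, dim=-1) * kernel         # kernel-modulated weights
    weights = weights / weights.sum(dim=-1, keepdim=True)    # renormalize rows
    return weights @ x


if __name__ == "__main__":
    adj = torch.tensor([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
    x = torch.randn(3, 4)
    print(kernel_modulated_attention(x, adj).shape)  # torch.Size([3, 4])
```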