Maha Farhat

Geometric multimodal representation learning

Sep 07, 2022

Yasha Ektefaie, George Dasoulas, Ayush Noori, Maha Farhat, Marinka Zitnik

Abstract: Graph-centric artificial intelligence (graph AI) has achieved remarkable success in modeling interacting systems prevalent in nature, from dynamical systems in biology to particle physics. The increasing heterogeneity of data calls for graph neural architectures that can combine multiple inductive biases. However, combining data from various sources is challenging because the appropriate inductive bias may vary by data modality. Multimodal learning methods address this challenge by fusing multiple data modalities while leveraging cross-modal dependencies. Here, we survey 140 studies in graph-centric AI and find that diverse data types are increasingly brought together using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal learning. Based on this categorization, we put forward an algorithmic blueprint for multimodal graph learning. The blueprint groups state-of-the-art architectures that handle multimodal data by appropriately choosing four different components. This effort can pave the way for standardizing the design of sophisticated multimodal architectures for highly complex real-world problems.

* 28 pages, 5 figures, 2 boxes
