Title: Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering

Dec 13, 2018

Gao Peng, Hongsheng Li, Haoxuan You, Zhengkai Jiang, Pan Lu, Steven Hoi, Xiaogang Wang

Abstract: Learning an effective fusion of multi-modality features is at the heart of visual question answering. We propose a novel method of dynamically fusing multi-modal features with intra- and inter-modality information flow, which alternately passes dynamic information within and across the visual and language modalities. It can robustly capture the high-level interactions between the language and vision domains, thus significantly improving the performance of visual question answering. We also show that the proposed dynamic intra-modality attention flow, conditioned on the other modality, can dynamically modulate the intra-modality attention of the target modality, which is vital for multi-modality feature fusion. Experimental evaluations on the VQA 2.0 dataset show that the proposed method achieves state-of-the-art VQA performance. Extensive ablation studies are carried out for a comprehensive analysis of the proposed method.
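
The alternation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with no learned projections, and it assumes the conditioning on the other modality is a sigmoid gate computed from that modality's mean-pooled features. The names `attend`, `fusion_step`, `regions`, and `words` are hypothetical and chosen only for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values, gate=None):
    # Scaled dot-product attention over lists of feature vectors.
    # Assumption: an optional per-dimension gate rescales queries and keys.
    dim = len(keys[0])
    if gate is not None:
        queries = [[q * (1 + g) for q, g in zip(row, gate)] for row in queries]
        keys = [[k * (1 + g) for k, g in zip(row, gate)] for row in keys]
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def add(xs, ys):
    # Row-wise residual connection.
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(xs, ys)]

def mean_pool(rows):
    # Average the feature vectors of one modality into a single vector.
    return [sum(col) / len(rows) for col in zip(*rows)]

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def fusion_step(regions, words):
    # Inter-modality flow: each modality attends to the other one.
    regions_inter = add(regions, attend(regions, words, words))
    words_inter = add(words, attend(words, regions, regions))
    # Intra-modality flow, conditioned on the other modality (assumed gate).
    region_gate = [sigmoid(x) for x in mean_pool(words_inter)]
    word_gate = [sigmoid(x) for x in mean_pool(regions_inter)]
    regions_out = add(regions_inter,
                      attend(regions_inter, regions_inter, regions_inter, region_gate))
    words_out = add(words_inter,
                    attend(words_inter, words_inter, words_inter, word_gate))
    return regions_out, words_out

# Toy input: 3 image-region features and 2 question-word features, 4 dims each.
regions = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2], [0.3, 0.3, 0.3, 0.3]]
words = [[0.2, 0.0, 0.1, 0.5], [0.4, 0.4, 0.1, 0.0]]
regions_out, words_out = fusion_step(regions, words)
```

Calling `fusion_step` repeatedly on its own outputs would alternate the two flows several times; how many times, and with which learned parameters, is left open in this sketch.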
