Title: Question-Driven Graph Fusion Network For Visual Question Answering

Apr 03, 2022

Yuxi Qian, Yuncong Hu, Ruonan Wang, Fangxiang Feng, Xiaojie Wang

Abstract: Existing Visual Question Answering (VQA) models have explored various visual relationships between objects in the image to answer complex questions, which inevitably introduces irrelevant information caused by inaccurate object detection and text grounding. To address this problem, we propose a Question-Driven Graph Fusion Network (QD-GFN). It first models semantic, spatial, and implicit visual relations in images with three graph attention networks. Question information is then used to guide the aggregation of the three graphs. In addition, QD-GFN adopts an object filtering mechanism to remove question-irrelevant objects from the image. Experimental results demonstrate that QD-GFN outperforms the prior state of the art on both the VQA 2.0 and VQA-CP v2 datasets. Further analysis shows that both the novel graph aggregation method and the object filtering mechanism play a significant role in improving model performance.

* Accepted by ICME 2022
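
The abstract names two mechanisms, question-guided aggregation of three relation graphs and question-driven object filtering, without giving formulas. The sketch below is only an illustration of that general idea, not the authors' implementation. Everything in it is an assumption made for the example: the function names, the mean pooling over objects, the bilinear scoring matrix `w_gate`, the softmax weighting, and the top-k filtering rule. The graph attention networks themselves are not modeled; their outputs are taken as given arrays.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def question_driven_fusion(graph_feats, question, w_gate):
    """Fuse object features from three relation graphs (hypothetical scheme).

    graph_feats: (3, N, D) object features, one slice per relation graph
                 (semantic, spatial, implicit), assumed precomputed.
    question:    (D,) question embedding.
    w_gate:      (D, D) assumed bilinear scoring matrix.
    Returns the fused (N, D) features and the (3,) graph weights.
    """
    # Summarize each graph by mean pooling over its N objects: (3, D).
    summaries = graph_feats.mean(axis=1)
    # Score each graph summary against the question: (3,).
    scores = summaries @ w_gate @ question
    # Normalize the scores into one weight per graph.
    weights = softmax(scores)
    # Weighted sum over the graph axis gives fused object features: (N, D).
    fused = np.tensordot(weights, graph_feats, axes=(0, 0))
    return fused, weights


def filter_objects(fused, question, keep):
    """Keep the `keep` objects most similar to the question (keep >= 1).

    Relevance is an assumed dot product between object and question vectors.
    Returns the kept features and their indices in original order.
    """
    relevance = fused @ question
    top = np.sort(np.argsort(relevance)[-keep:])
    return fused[top], top
```

A dot-product relevance score and a hard top-k cut are the simplest choices for the filtering step; a learned gate or a soft mask would fit the abstract's description equally well.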
