Title: Interpretable Visual Understanding with Cognitive Attention Network

Aug 14, 2021

Xuejiao Tang, Wenbin Zhang, Yi Yu, Kea Turner, Tyler Derr, Mengyu Wang, Eirini Ntoutsi

Abstract: While recognition-level image understanding has advanced remarkably, reliable visual scene understanding requires comprehension at the cognition level as well as the recognition level. This calls for exploiting multi-source information, learning different levels of understanding, and drawing on extensive commonsense knowledge. In this paper, we propose a novel Cognitive Attention Network (CAN) for visual commonsense reasoning to achieve interpretable visual understanding. Specifically, we first introduce an image-text fusion module that fuses information from images and text. Second, we design a novel inference module to encode commonsense among image, query, and response. Extensive experiments on the large-scale Visual Commonsense Reasoning (VCR) benchmark dataset demonstrate the effectiveness of our approach. The implementation is publicly available at https://github.com/tanjatang/CAN
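
To make the idea of attention-based image-text fusion concrete, here is a minimal sketch in plain Python. It is not the CAN implementation from the authors' repository: the abstract does not specify the fusion mechanism, so the sketch assumes generic scaled dot-product attention in which each text token attends over image region features and is concatenated with the resulting context. The names `attend` and `fuse` and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys):
    """For each query, return (weights, context).

    weights: attention distribution over the key vectors.
    context: weighted average of the key vectors.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    results = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        context = [sum(w * k[d] for w, k in zip(weights, keys))
                   for d in range(dim)]
        results.append((weights, context))
    return results


def fuse(text_tokens, image_regions):
    """Concatenate each token vector with its attended image context.

    Assumption: concatenation is used only for illustration; the paper's
    fusion module may combine the modalities differently.
    """
    attended = attend(text_tokens, image_regions)
    return [tok + ctx for tok, (_, ctx) in zip(text_tokens, attended)]


# Toy inputs (hypothetical): two text tokens, three image regions, dim 2.
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_regions = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

fused = fuse(text_tokens, image_regions)
weights_for_first_token = attend(text_tokens, image_regions)[0][0]
```

In a sketch like this, the attention weights are what one would inspect for interpretability: `weights_for_first_token` shows which image regions the first token relied on most.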

* ICANN21
