Title: Look, Remember and Reason: Visual Reasoning with Grounded Rationales

Jun 30, 2023

Apratim Bhattacharyya, Sunny Panchal, Mingu Lee, Reza Pourreza, Pulkit Madan, Roland Memisevic

Abstract: Large language models have recently shown human-level performance on a variety of reasoning tasks. However, the ability of these models to perform complex visual reasoning has not yet been studied in detail. A key challenge in many visual reasoning tasks is that the visual information needs to be tightly integrated into the reasoning process. We propose to address this challenge by drawing inspiration from human visual problem solving, which depends on a variety of low-level visual capabilities. It can often be cast as the three-step process of "Look, Remember, Reason": visual information is incrementally extracted using low-level visual routines in a step-by-step fashion until a final answer is reached. We follow the same paradigm to enable existing large language models, with minimal changes to the architecture, to solve visual reasoning problems. To this end, we introduce rationales over the visual input that allow us to integrate low-level visual capabilities, such as object recognition and tracking, as surrogate tasks. We show competitive performance on diverse visual reasoning tasks from the CLEVR, CATER, and ACRE datasets compared with state-of-the-art models designed specifically for these tasks.
