Title: QR-CLIP: Introducing Explicit Open-World Knowledge for Location and Time Reasoning

Feb 02, 2023

Weimin Shi, Mingchen Zhuge, Zhong Zhou, Dehong Gao, Deng-Ping Fan

Abstract: Everyday images may convey abstract meanings that require us to recall and infer deeper information from them. To encourage such human-like reasoning, in this work we teach machines to predict where and when an image was taken, rather than performing basic tasks such as traditional segmentation or classification. Inspired by Horn's QR theory, we design a novel QR-CLIP model consisting of two components: 1) the Quantity module first recalls additional open-world knowledge to serve as candidate language inputs; 2) the Relevance module carefully weighs vision and language cues and infers the location and time. Experiments demonstrate QR-CLIP's effectiveness: it outperforms the previous SOTA on each task, with an average relative lift of about 10% on location reasoning and about 130% on time reasoning. This study lays a technical foundation for location and time reasoning and suggests that effectively introducing open-world knowledge is one of the panaceas for these tasks.
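The two-stage flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`quantity_step`, `relevance_step`, `predict`), the softmax weighting, the additive fusion, and the 3-dimensional vectors are all hypothetical stand-ins chosen for illustration. A real system would presumably obtain embeddings from CLIP image and text encoders, which this sketch does not do.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def quantity_step(image_vec, knowledge, k):
    """Toy 'Quantity' stage (assumption): keep the k knowledge snippets
    whose embeddings are most similar to the image embedding."""
    ranked = sorted(knowledge,
                    key=lambda item: cosine(image_vec, item["vec"]),
                    reverse=True)
    return ranked[:k]

def relevance_step(image_vec, snippets):
    """Toy 'Relevance' stage (assumption): softmax-weight the snippets by
    similarity to the image and add them to the image embedding."""
    sims = [cosine(image_vec, s["vec"]) for s in snippets]
    peak = max(sims)
    exps = [math.exp(x - peak) for x in sims]
    total = sum(exps)
    fused = list(image_vec)
    for e, s in zip(exps, snippets):
        weight = e / total
        fused = [f + weight * x for f, x in zip(fused, s["vec"])]
    return fused

def predict(image_vec, knowledge, labels, k=2):
    """Pick the candidate label closest to the knowledge-augmented embedding."""
    snippets = quantity_step(image_vec, knowledge, k)
    fused = relevance_step(image_vec, snippets)
    return max(labels, key=lambda name: cosine(fused, labels[name]))

# Hypothetical toy embeddings; real ones would come from trained encoders.
image_vec = [0.9, 0.1, 0.0]
knowledge = [
    {"text": "snippet A", "vec": [1.0, 0.0, 0.0]},
    {"text": "snippet B", "vec": [0.8, 0.2, 0.1]},
    {"text": "snippet C", "vec": [0.0, 0.0, 1.0]},
]
labels = {"label_x": [1.0, 0.0, 0.0], "label_y": [0.0, 1.0, 0.0]}

print(predict(image_vec, knowledge, labels))  # prints "label_x"
```

In this toy setup the unrelated snippet C is dropped by the first stage, and the two retained snippets pull the fused embedding further toward `label_x`. The same scoring could be applied separately to location labels and time labels.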

* On-Processing Work
