Title: From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering

May 30, 2022

Jiangtong Li, Li Niu, Liqing Zhang

Abstract: Video understanding has achieved great success in representation learning tasks such as video captioning, video object grounding, and descriptive video question answering. However, current methods still struggle with video reasoning, including evidence reasoning and commonsense reasoning. To push video understanding toward deeper reasoning, we present the Causal-VidQA task, which includes four types of questions ranging from scene description (description) to evidence reasoning (explanation) and commonsense reasoning (prediction and counterfactual). For commonsense reasoning, we set up a two-step solution: answering the question and providing a proper reason. Through extensive experiments on existing VideoQA methods, we find that state-of-the-art methods are strong at description but weak at reasoning. We hope that Causal-VidQA can guide video understanding research from representation learning toward deeper reasoning. The dataset and related resources are available at https://github.com/bcmi/Causal-VidQA.git.

* To appear in CVPR 2022
