Title: Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning

Nov 29, 2023

Xiaoqian Wu, Yong-Lu Li, Jianhua Sun, Cewu Lu

Abstract: Human reasoning can be understood as a cooperation between the intuitive, associative "System-1" and the deliberative, logical "System-2". For existing System-1-like methods in visual activity understanding, it is crucial to integrate System-2 processing to improve explainability, generalization, and data efficiency. One possible path for activity reasoning is to build a symbolic system composed of symbols and rules, where each rule connects multiple symbols and encodes human knowledge and reasoning abilities. Previous methods have made progress but are limited by handcrafted symbols and by rules derived from visual-based annotations, so they fail to cover the complex patterns of activities and lack compositional generalization. To overcome these defects, we propose a new symbolic system with two important ideal properties: broad-coverage symbols and rational rules. Instantiating this symbolic system by collecting massive human knowledge via manual annotation is expensive. Instead, we leverage recent advances in LLMs (Large Language Models) to approximate the two ideal properties, i.e., Symbols from Large Language Models (Symbol-LLM). Then, given an image, visual contents are extracted and checked as symbols, and activity semantics are reasoned out based on rules via fuzzy logic calculation. Our method shows superiority in extensive activity understanding tasks. Code and data are available at https://mvig-rhos.com/symbol_llm.

* Accepted by NeurIPS 2023
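
The reasoning step described in the abstract (symbols checked against an image, then combined through rules by fuzzy logic) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fuzzy operators, so the min/max choice below is an assumption, and the symbol strings, the rule format, and the function names `fuzzy_and`, `fuzzy_or`, and `activity_score` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical rule format: a list of premise symbols that together imply
# one activity. The real Symbol-LLM rule representation may differ.
Rule = List[str]


def fuzzy_and(values: List[float]) -> float:
    """Fuzzy conjunction using the minimum t-norm (an assumed choice)."""
    return min(values) if values else 0.0


def fuzzy_or(values: List[float]) -> float:
    """Fuzzy disjunction using the maximum t-conorm (an assumed choice)."""
    return max(values) if values else 0.0


def activity_score(symbol_probs: Dict[str, float], rules: List[Rule]) -> float:
    """Score one activity: AND the symbols within each rule, OR across rules.

    symbol_probs maps a symbol to the confidence that it holds in the image;
    symbols missing from the map are treated as absent (confidence 0.0).
    """
    rule_scores = [
        fuzzy_and([symbol_probs.get(symbol, 0.0) for symbol in rule])
        for rule in rules
    ]
    return fuzzy_or(rule_scores)


# Made-up symbol confidences, standing in for the output of a visual checker.
symbol_probs = {
    "person holds board": 0.9,
    "board on water": 0.8,
    "person stands on board": 0.3,
}

# Two made-up rules that both conclude the same activity.
rules = [
    ["person holds board", "board on water"],
    ["person stands on board", "board on water"],
]

score = activity_score(symbol_probs, rules)
print(f"activity score: {score:.2f}")
```

With min/max operators, an activity scores highly only when every symbol in at least one of its rules is confidently present, which is one simple way to make the prediction traceable to the symbols that support it. Other operator pairs, such as product and probabilistic sum, would fit the same structure.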
