Linhang Wang

Can LLM find the green circle? Investigation and Human-guided tool manipulation for compositional generalization

Dec 12, 2023

Min Zhang, Jianfeng He, Shuo Lei, Murong Yue, Linhang Wang, Chang-Tien Lu

Abstract: The meaning of complex phrases in natural language is composed of their individual components. The task of compositional generalization evaluates a model's ability to understand new combinations of components. Previous studies trained smaller, task-specific models, which exhibited poor generalization. While large language models (LLMs) exhibit impressive generalization abilities on many tasks through in-context learning (ICL), their potential for compositional generalization remains unexplored. In this paper, we first empirically investigate prevailing ICL methods in compositional generalization. We find that they struggle with complex compositional questions due to cumulative errors in long reasoning steps and intricate logic required for tool-making. Consequently, we propose a human-guided tool manipulation framework (HTM) that generates tools for sub-questions and integrates multiple tools. Our method enhances the effectiveness of tool creation and usage with minimal human effort. Experiments show that our method achieves state-of-the-art performance on two compositional generalization benchmarks and outperforms existing methods on the most challenging test split by 70%.

* Accepted by ICASSP 2024
