Jicong Ao

HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning

Jul 01, 2025

Zhi Jing, Siyuan Yang, Jicong Ao, Ting Xiao, Yugang Jiang, Chenjia Bai

Abstract: For robotic manipulation, existing datasets and simulation benchmarks predominantly cater to robot-arm platforms. However, for humanoid robots equipped with dual arms and dexterous hands, simulation tasks and high-quality demonstrations are notably lacking. Bimanual dexterous manipulation is inherently more complex, as it requires coordinated arm movements and hand operations, making autonomous data collection challenging. This paper presents HumanoidGen, an automated task creation and demonstration collection framework that leverages atomic dexterous operations and LLM reasoning to generate relational constraints. Specifically, we provide spatial annotations for both assets and dexterous hands based on the atomic operations, and use an LLM planner to generate a chain of actionable spatial constraints for arm movements based on object affordances and scenes. To further improve planning ability, we employ a variant of Monte Carlo tree search to enhance LLM reasoning for long-horizon tasks and for cases with insufficient annotation. In experiments, we create a novel benchmark with augmented scenarios to evaluate the quality of the collected data. The results show that the performance of the 2D and 3D diffusion policies can scale with the generated dataset. The project page is https://openhumanoidgen.github.io.

* Project Page: https://openhumanoidgen.github.io

LLM as BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning

Sep 16, 2024

Jicong Ao, Fan Wu, Yansong Wu, Abdalla Swikir, Sami Haddadin

Abstract: Robotic assembly tasks are open challenges due to the long task horizon and complex part relations. Behavior trees (BTs) are increasingly used in robot task planning for their modularity and flexibility, but manually designing them can be effort-intensive. Large language models (LLMs) have recently been applied in robotic task planning for generating action sequences, but their ability to generate BTs has not been fully investigated. To this end, we propose LLM as BT-Planner, a novel framework to leverage LLMs for BT generation in robotic assembly task planning and execution. Four in-context learning methods are introduced to utilize the natural language processing and inference capabilities of LLMs to produce task plans in BT format, reducing manual effort and ensuring robustness and comprehensibility. We also evaluate the performance of fine-tuned, fewer-parameter LLMs on the same tasks. Experiments in simulated and real-world settings show that our framework improves LLMs' success rates in BT generation through in-context learning and supervised fine-tuning.

* 8 pages

Behavior Tree Generation using Large Language Models for Sequential Manipulation Planning with Human Instructions and Feedback

Sep 14, 2024

Jicong Ao, Yansong Wu, Fan Wu, Sami Haddadin

Abstract: In this work, we propose an LLM-based BT generation framework to leverage the strengths of both LLMs and BTs for sequential manipulation planning. To enable human-robot collaborative task planning and enhance intuitive robot programming by nonexperts, the framework takes human instructions to initiate the generation of action sequences and human feedback to refine BT generation at runtime. All presented methods within the framework are tested on a real robotic assembly example, which uses a gear set model from the Siemens Robot Assembly Challenge. We use a single manipulator with a tool-changing mechanism, a common practice in flexible manufacturing, to facilitate robust grasping of a large variety of objects. Experimental results are evaluated regarding success rate, logical coherence, executability, time consumption, and token consumption. To our knowledge, this is the first human-guided LLM-based BT generation framework that unifies various plausible ways of using LLMs to fully generate BTs that are executable on the real testbed and take into account granular knowledge of tool use.

* ICRA 2024 Workshop Exploring Role Allocation in Human-Robot Co-Manipulation
