Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ziyuan Jiao

IKDiffuser: Fast and Diverse Inverse Kinematics Solution Generation for Multi-arm Robotic Systems

Jun 16, 2025

Zeyu Zhang, Ziyuan Jiao

Abstract:Solving Inverse Kinematics (IK) problems is fundamental to robotics, but has primarily been successful with single serial manipulators. For multi-arm robotic systems, IK remains challenging due to complex self-collisions, coupled joints, and high-dimensional redundancy. These complexities make traditional IK solvers slow, prone to failure, and lacking in solution diversity. In this paper, we present IKDiffuser, a diffusion-based model designed for fast and diverse IK solution generation for multi-arm robotic systems. IKDiffuser learns the joint distribution over the configuration space, capturing complex dependencies and enabling seamless generalization to multi-arm robotic systems of different structures. In addition, IKDiffuser can incorporate additional objectives during inference without retraining, offering versatility and adaptability for task-specific requirements. In experiments on 6 different multi-arm systems, the proposed IKDiffuser achieves superior solution accuracy, precision, diversity, and computational efficiency compared to existing solvers. The proposed IKDiffuser framework offers a scalable, unified approach to solving multi-arm IK problems, facilitating the potential of multi-arm robotic systems in real-time manipulation tasks.

* under review

Via

Access Paper or Ask Questions

PP-Tac: Paper Picking Using Tactile Feedback in Dexterous Robotic Hands

Apr 23, 2025

Pei Lin, Yuzhe Huang, Wanlin Li, Jianpeng Ma, Chenxi Xiao, Ziyuan Jiao

Abstract:Robots are increasingly envisioned as human companions, assisting with everyday tasks that often involve manipulating deformable objects. Although recent advances in robotic hardware and embodied AI have expanded their capabilities, current systems still struggle with handling thin, flat, and deformable objects such as paper and fabric. This limitation arises from the lack of suitable perception techniques for robust state estimation under diverse object appearances, as well as the absence of planning techniques for generating appropriate grasp motions. To bridge these gaps, this paper introduces PP-Tac, a robotic system for picking up paper-like objects. PP-Tac features a multi-fingered robotic hand with high-resolution omnidirectional tactile sensors \sensorname. This hardware configuration enables real-time slip detection and online frictional force control that mitigates such slips. Furthermore, grasp motion generation is achieved through a trajectory synthesis pipeline, which first constructs a dataset of finger's pinching motions. Based on this dataset, a diffusion-based policy is trained to control the hand-arm robotic system. Experiments demonstrate that PP-Tac can effectively grasp paper-like objects of varying material, thickness, and stiffness, achieving an overall success rate of 87.5\%. To our knowledge, this work is the first attempt to grasp paper-like deformable objects using a tactile dexterous hand. Our project webpage can be found at: https://peilin-666.github.io/projects/PP-Tac/

* accepted by Robotics: Science and Systems(RSS) 2025

Via

Access Paper or Ask Questions

Flight Structure Optimization of Modular Reconfigurable UAVs

Jul 04, 2024

Yao Su, Ziyuan Jiao, Zeyu Zhang, Jingwen Zhang, Hang Li, Meng Wang, Hangxin Liu

Abstract:This paper presents a Genetic Algorithm (GA) designed to reconfigure a large group of modular Unmanned Aerial Vehicles (UAVs), each with different weights and inertia parameters, into an over-actuated flight structure with improved dynamic properties. Previous research efforts either utilized expert knowledge to design flight structures for a specific task or relied on enumeration-based algorithms that required extensive computation to find an optimal one. However, both approaches encounter challenges in accommodating the heterogeneity among modules. Our GA addresses these challenges by incorporating the complexities of over-actuation and dynamic properties into its formulation. Additionally, we employ a tree representation and a vector representation to describe flight structures, facilitating efficient crossover operations and fitness evaluations within the GA framework, respectively. Using cubic modular quadcopters capable of functioning as omni-directional thrust generators, we validate that the proposed approach can (i) adeptly identify suboptimal configurations ensuring over-actuation while ensuring trajectory tracking accuracy and (ii) significantly reduce computational costs compared to traditional enumeration-based methods.

Via

Access Paper or Ask Questions

Dynamic Planning for Sequential Whole-body Mobile Manipulation

May 24, 2024

Zhitian Li, Yida Niu, Yao Su, Hangxin Liu, Ziyuan Jiao

Figure 1 for Dynamic Planning for Sequential Whole-body Mobile Manipulation

Figure 2 for Dynamic Planning for Sequential Whole-body Mobile Manipulation

Figure 3 for Dynamic Planning for Sequential Whole-body Mobile Manipulation

Figure 4 for Dynamic Planning for Sequential Whole-body Mobile Manipulation

Abstract:The dynamic Sequential Mobile Manipulation Planning (SMMP) framework is essential for the safe and robust operation of mobile manipulators in dynamic environments. Previous research has primarily focused on either motion-level or task-level dynamic planning, with limitations in handling state changes that have long-term effects or in generating responsive motions for diverse tasks, respectively. This paper presents a holistic dynamic planning framework that extends the Virtual Kinematic Chain (VKC)-based SMMP method, automating dynamic long-term task planning and reactive whole-body motion generation for SMMP problems. The framework consists of an online task planning module designed to respond to environment changes with long-term effects, a VKC-based whole-body motion planning module for manipulating both rigid and articulated objects, alongside a reactive Model Predictive Control (MPC) module for obstacle avoidance during execution. Simulations and real-world experiments validate the framework, demonstrating its efficacy and validity across sequential mobile manipulation tasks, even in scenarios involving human interference.

Via

Access Paper or Ask Questions

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Apr 16, 2024

Peiyuan Zhi, Zhiyuan Zhang, Muzhi Han, Zeyu Zhang, Zhitian Li, Ziyuan Jiao, Baoxiong Jia, Siyuan Huang

Figure 1 for Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Figure 2 for Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Figure 3 for Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Figure 4 for Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

Abstract:Autonomous robot navigation and manipulation in open environments require reasoning and replanning with closed-loop feedback. We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. We meticulously construct a library of action primitives for robot exploration, navigation, and manipulation, serving as callable execution modules for GPT-4V in task planning. On top of these modules, GPT-4V serves as the brain that can accomplish multimodal reasoning, generate action policy with code, verify the task progress, and provide feedback for replanning. Such design enables COME-robot to (i) actively perceive the environments, (ii) perform situated reasoning, and (iii) recover from failures. Through comprehensive experiments involving 8 challenging real-world tabletop and manipulation tasks, COME-robot demonstrates a significant improvement in task success rate (~25%) compared to state-of-the-art baseline methods. We further conduct comprehensive analyses to elucidate how COME-robot's design facilitates failure recovery, free-form instruction following, and long-horizon task planning.

Via

Access Paper or Ask Questions

LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Mar 20, 2024

Shu Wang, Muzhi Han, Ziyuan Jiao, Zeyu Zhang, Ying Nian Wu, Song-Chun Zhu, Hangxin Liu

Figure 1 for LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Figure 2 for LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Figure 3 for LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Figure 4 for LLM3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Abstract:Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Specifically, we leverage the powerful reasoning and planning capabilities of pre-trained LLMs to propose symbolic action sequences and select continuous action parameters for motion planning. Crucially, LLM^3 incorporates motion planning feedback through prompting, allowing the LLM to iteratively refine its proposals by reasoning about motion failure. Consequently, LLM^3 interfaces between task planning and motion planning, alleviating the intricate design process of handling domain-specific messages between them. Through a series of simulations in a box-packing domain, we quantitatively demonstrate the effectiveness of LLM^3 in solving TAMP problems and the efficiency in selecting action parameters. Ablation studies underscore the significant contribution of motion failure reasoning to the success of LLM^3. Furthermore, we conduct qualitative experiments on a physical manipulator, demonstrating the practical applicability of our approach in real-world settings.

* Submitted to IROS 2024. Codes available: https://github.com/AssassinWS/LLM-TAMP

Via

Access Paper or Ask Questions

On the Emergence of Symmetrical Reality

Jan 26, 2024

Zhenliang Zhang, Zeyu Zhang, Ziyuan Jiao, Yao Su, Hangxin Liu, Wei Wang, Song-Chun Zhu

Abstract:Artificial intelligence (AI) has revolutionized human cognitive abilities and facilitated the development of new AI entities capable of interacting with humans in both physical and virtual environments. Despite the existence of virtual reality, mixed reality, and augmented reality for several years, integrating these technical fields remains a formidable challenge due to their disparate application directions. The advent of AI agents, capable of autonomous perception and action, further compounds this issue by exposing the limitations of traditional human-centered research approaches. It is imperative to establish a comprehensive framework that accommodates the dual perceptual centers of humans and AI agents in both physical and virtual worlds. In this paper, we introduce the symmetrical reality framework, which offers a unified representation encompassing various forms of physical-virtual amalgamations. This framework enables researchers to better comprehend how AI agents can collaborate with humans and how distinct technical pathways of physical-virtual integration can be consolidated from a broader perspective. We then delve into the coexistence of humans and AI, demonstrating a prototype system that exemplifies the operation of symmetrical reality systems for specific tasks, such as pouring water. Subsequently, we propose an instance of an AI-driven active assistance service that illustrates the potential applications of symmetrical reality. This paper aims to offer beneficial perspectives and guidance for researchers and practitioners in different fields, thus contributing to the ongoing research about human-AI coexistence in both physical and virtual environments.

* IEEE VR 2024

Via

Access Paper or Ask Questions

Get the Ball Rolling: Alerting Autonomous Robots When to Help to Close the Healthcare Loop

Nov 05, 2023

Jiaxin Shen, Yanyao Liu, Ziming Wang, Ziyuan Jiao, Yufeng Chen, Wenjuan Han

Figure 1 for Get the Ball Rolling: Alerting Autonomous Robots When to Help to Close the Healthcare Loop

Figure 2 for Get the Ball Rolling: Alerting Autonomous Robots When to Help to Close the Healthcare Loop

Figure 3 for Get the Ball Rolling: Alerting Autonomous Robots When to Help to Close the Healthcare Loop

Figure 4 for Get the Ball Rolling: Alerting Autonomous Robots When to Help to Close the Healthcare Loop

Abstract:To facilitate the advancement of research in healthcare robots without human intervention or commands, we introduce the Autonomous Helping Challenge, along with a crowd-sourcing large-scale dataset. The goal is to create healthcare robots that possess the ability to determine when assistance is necessary, generate useful sub-tasks to aid in planning, carry out these plans through a physical robot, and receive feedback from the environment in order to generate new tasks and continue the process. Besides the general challenge in open-ended scenarios, Autonomous Helping focuses on three specific challenges: autonomous task generation, the gap between the current scene and static commonsense, and the gap between language instruction and the real world. Additionally, we propose Helpy, a potential approach to close the healthcare loop in the learning-free setting.

* 12 pages, 6 figures

Via

Access Paper or Ask Questions

Part-level Scene Reconstruction Affords Robot Interaction

Aug 01, 2023

Zeyu Zhang, Lexing Zhang, Zaijin Wang, Ziyuan Jiao, Muzhi Han, Yixin Zhu, Song-Chun Zhu, Hangxin Liu

Abstract:Existing methods for reconstructing interactive scenes primarily focus on replacing reconstructed objects with CAD models retrieved from a limited database, resulting in significant discrepancies between the reconstructed and observed scenes. To address this issue, our work introduces a part-level reconstruction approach that reassembles objects using primitive shapes. This enables us to precisely replicate the observed physical scenes and simulate robot interactions with both rigid and articulated objects. By segmenting reconstructed objects into semantic parts and aligning primitive shapes to these parts, we assemble them as CAD models while estimating kinematic relations, including parent-child contact relations, joint types, and parameters. Specifically, we derive the optimal primitive alignment by solving a series of optimization problems, and estimate kinematic relations based on part semantics and geometry. Our experiments demonstrate that part-level scene reconstruction outperforms object-level reconstruction by accurately capturing finer details and improving precision. These reconstructed part-level interactive scenes provide valuable kinematic information for various robotic applications; we showcase the feasibility of certifying mobile manipulation planning in these interactive scenes before executing tasks in the physical world.

* IROS 2023 paper

Via

Access Paper or Ask Questions

Sequential Manipulation Planning for Over-actuated Unmanned Aerial Manipulators

Jul 11, 2023

Yao Su, Jiarui Li, Ziyuan Jiao, Meng Wang, Chi Chu, Hang Li, Yixin Zhu, Hangxin Liu

Abstract:We investigate the sequential manipulation planning problem for unmanned aerial manipulators (UAMs). Unlike prior work that primarily focuses on one-step manipulation tasks, sequential manipulations require coordinated motions of a UAM's floating base, the manipulator, and the object being manipulated, entailing a unified kinematics and dynamics model for motion planning under designated constraints. By leveraging a virtual kinematic chain (VKC)-based motion planning framework that consolidates components' kinematics into one chain, the sequential manipulation task of a UAM can be planned as a whole, yielding more coordinated motions. Integrating the kinematics and dynamics models with a hierarchical control framework, we demonstrate, for the first time, an over-actuated UAM achieves a series of new sequential manipulation capabilities in both simulation and experiment.

* IROS 2023

Via

Access Paper or Ask Questions