Yinghao Zhao

OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints

Jan 07, 2025

Mingjie Pan, Jiyao Zhang, Tianshu Wu, Yinghao Zhao, Wenlong Gao, Hao Dong

Abstract: The development of general robotic systems capable of manipulation in unstructured environments is a significant challenge. While Vision-Language Models (VLMs) excel at high-level commonsense reasoning, they lack the fine-grained 3D spatial understanding required for precise manipulation tasks. Fine-tuning VLMs on robotic datasets to create Vision-Language-Action Models (VLAs) is a potential solution, but it is hindered by high data collection costs and generalization issues. To address these challenges, we propose a novel object-centric representation that bridges the gap between the high-level reasoning of VLMs and the low-level precision required for manipulation. Our key insight is that an object's canonical space, defined by its functional affordances, provides a structured and semantically meaningful way to describe interaction primitives, such as points and directions. These primitives act as a bridge, translating the commonsense reasoning of VLMs into actionable 3D spatial constraints. In this context, we introduce a dual closed-loop, open-vocabulary robotic manipulation system: one loop for high-level planning through primitive resampling, interaction rendering, and VLM checking, and another for low-level execution via 6D pose tracking. This design ensures robust, real-time control without requiring VLM fine-tuning. Extensive experiments demonstrate strong zero-shot generalization across diverse robotic manipulation tasks, highlighting the potential of this approach for automating large-scale simulation data generation.

Autonomous Exploration Method for Fast Unknown Environment Mapping by Using UAV Equipped with Limited FOV Sensor

Feb 05, 2023

Yinghao Zhao, Li Yan, Hong Xie, Jicheng Dai, Pengcheng Wei

Abstract: Autonomous exploration is an important component of fast autonomous mapping and target search. However, most existing methods suffer from low efficiency caused by low-quality trajectories or back-and-forth maneuvers. To improve exploration efficiency in unknown environments, a fast autonomous exploration planner (FAEP) is proposed in this paper. Unlike existing methods, we first design a novel frontier exploration sequence generation method to obtain a more reasonable exploration path, which considers not only flight-level but also frontier-level factors in the asymmetric traveling salesman problem (ATSP). Then, according to the exploration sequence and the distribution of frontiers, an adaptive yaw planning method is proposed to cover more frontiers through yaw changes during an exploration journey. In addition, to increase the speed and fluency of flight, a dynamic replanning strategy is adopted. We present comparison and evaluation experiments in simulation environments. Experimental results show that the proposed exploration planner performs better in terms of flight time and flight distance than typical and state-of-the-art methods. Moreover, the effectiveness of the proposed method is further evaluated in real-world environments.

* 10 pages, 10 figures. arXiv admin note: substantial text overlap with arXiv:2202.12507

Learning to Fill the Seam by Vision: Sub-millimeter Peg-in-hole on Unseen Shapes in Real World

Apr 20, 2022

Liang Xie, Hongxiang Yu, Yinghao Zhao, Haodong Zhang, Zhongxiang Zhou, Minhang Wang, Yue Wang, Rong Xiong

Abstract: In the peg insertion task, humans pay attention to the seam between the peg and the hole and try to fill it continuously using visual feedback. By imitating this behavior, we design architectures with position and orientation estimators based on the seam representation for pose alignment, which prove to generalize to unseen peg geometries. By putting the estimators into closed-loop control with reinforcement learning, we further achieve a higher or comparable success rate, efficiency, and robustness compared with the baseline methods. The policy is trained entirely in simulation without any manual intervention. To achieve sim-to-real transfer, a learnable segmentation module with automatic data collection and labeling can be easily trained to decouple perception from the policy, which helps the model trained in simulation quickly adapt to the real world with negligible effort. Results are presented in simulation and on a physical robot. Code, videos, and supplemental material are available at https://github.com/xieliang555/SFN.git

* 6 pages; accepted to IEEE International Conference on Robotics and Automation 2022 (ICRA 2022)

FAEP: Fast Autonomous Exploration Planner for UAV Equipped with Limited FOV Sensor

Feb 25, 2022

Yinghao Zhao, Li Yan, Yu Chen, Hong Xie, Bo Xu

Abstract: Autonomous exploration is an important component of the autonomous operation of Unmanned Aerial Vehicles (UAVs). To improve the efficiency of the exploration process, a fast autonomous exploration planner (FAEP) is proposed in this paper. We first design a novel frontier exploration sequence generation method to obtain a more reasonable exploration path, which incorporates not only flight-level but also frontier-level factors into the TSP. According to the exploration sequence and the distribution of frontiers, a two-stage heading planning strategy is proposed to cover more frontiers through heading changes during an exploration journey. To improve the stability of path searching, a guided kinodynamic path search based on a guiding path is devised. In addition, a dynamic start point selection method for replanning is adopted to increase the fluency of flight. We present benchmark and real-world experiments. Experimental results show the superiority of the proposed exploration planner compared with typical and state-of-the-art methods.
