Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yiting Chen

B4P: Simultaneous Grasp and Motion Planning for Object Placement via Parallelized Bidirectional Forests and Path Repair

Apr 06, 2025

Benjamin H. Leebron, Kejia Ren, Yiting Chen, Kaiyu Hang

Abstract:Robot pick and place systems have traditionally decoupled grasp, placement, and motion planning to build sequential optimization pipelines with the assumption that the individual components will be able to work together. However, this separation introduces sub-optimality, as grasp choices may limit or even prohibit feasible motions for a robot to reach the target placement pose, particularly in cluttered environments with narrow passages. To this end, we propose a forest-based planning framework to simultaneously find grasp configurations and feasible robot motions that explicitly satisfy downstream placement configurations paired with the selected grasps. Our proposed framework leverages a bidirectional sampling-based approach to build a start forest, rooted at the feasible grasp regions, and a goal forest, rooted at the feasible placement regions, to facilitate the search through randomly explored motions that connect valid pairs of grasp and placement trees. We demonstrate that the framework's inherent parallelism enables superlinear speedup, making it scalable for applications for redundant robot arms (e.g., 7 Degrees of Freedom) to work efficiently in highly cluttered environments. Extensive experiments in simulation demonstrate the robustness and efficiency of the proposed framework in comparison with multiple baselines under diverse scenarios.

Via

Access Paper or Ask Questions

ARC-Calib: Autonomous Markerless Camera-to-Robot Calibration via Exploratory Robot Motions

Mar 18, 2025

Podshara Chanrungmaneekul, Yiting Chen, Joshua T. Grace, Aaron M. Dollar, Kaiyu Hang

Abstract:Camera-to-robot (also known as eye-to-hand) calibration is a critical component of vision-based robot manipulation. Traditional marker-based methods often require human intervention for system setup. Furthermore, existing autonomous markerless calibration methods typically rely on pre-trained robot tracking models that impede their application on edge devices and require fine-tuning for novel robot embodiments. To address these limitations, this paper proposes a model-based markerless camera-to-robot calibration framework, ARC-Calib, that is fully autonomous and generalizable across diverse robots and scenarios without requiring extensive data collection or learning. First, exploratory robot motions are introduced to generate easily trackable trajectory-based visual patterns in the camera's image frames. Then, a geometric optimization framework is proposed to exploit the coplanarity and collinearity constraints from the observed motions to iteratively refine the estimated calibration result. Our approach eliminates the need for extra effort in either environmental marker setup or data collection and model training, rendering it highly adaptable across a wide range of real-world autonomous systems. Extensive experiments are conducted in both simulation and the real world to validate its robustness and generalizability.

* 8 pages, 9 figures

Via

Access Paper or Ask Questions

ImplicitCell: Resolution Cell Modeling of Joint Implicit Volume Reconstruction and Pose Refinement in Freehand 3D Ultrasound

Mar 09, 2025

Sheng Song, Yiting Chen, Duo Xu, Songhan Ge, Yunqian Huang, Junni Shi, Man Chen, Hongbo Chen, Rui Zheng

Abstract:Freehand 3D ultrasound enables volumetric imaging by tracking a conventional ultrasound probe during freehand scanning, offering enriched spatial information that improves clinical diagnosis. However, the quality of reconstructed volumes is often compromised by tracking system noise and irregular probe movements, leading to artifacts in the final reconstruction. To address these challenges, we propose ImplicitCell, a novel framework that integrates Implicit Neural Representation (INR) with an ultrasound resolution cell model for joint optimization of volume reconstruction and pose refinement. Three distinct datasets are used for comprehensive validation, including phantom, common carotid artery, and carotid atherosclerosis. Experimental results demonstrate that ImplicitCell significantly reduces reconstruction artifacts and improves volume quality compared to existing methods, particularly in challenging scenarios with noisy tracking data. These improvements enhance the clinical utility of freehand 3D ultrasound by providing more reliable and precise diagnostic information.

Via

Access Paper or Ask Questions

OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Mar 14, 2024

Lingyi Hong, Shilin Yan, Renrui Zhang, Wanyun Li, Xinyu Zhou, Pinxue Guo, Kaixun Jiang, Yiting Chen, Jinglun Li, Zhaoyu Chen(+1 more)

Figure 1 for OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Figure 2 for OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Figure 3 for OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Figure 4 for OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning

Abstract:Visual object tracking aims to localize the target object of each frame based on its initial appearance in the first frame. Depending on the input modility, tracking tasks can be divided into RGB tracking and RGB+X (e.g. RGB+N, and RGB+D) tracking. Despite the different input modalities, the core aspect of tracking is the temporal matching. Based on this common ground, we present a general framework to unify various tracking tasks, termed as OneTracker. OneTracker first performs a large-scale pre-training on a RGB tracker called Foundation Tracker. This pretraining phase equips the Foundation Tracker with a stable ability to estimate the location of the target object. Then we regard other modality information as prompt and build Prompt Tracker upon Foundation Tracker. Through freezing the Foundation Tracker and only adjusting some additional trainable parameters, Prompt Tracker inhibits the strong localization ability from Foundation Tracker and achieves parameter-efficient finetuning on downstream RGB+X tracking tasks. To evaluate the effectiveness of our general framework OneTracker, which is consisted of Foundation Tracker and Prompt Tracker, we conduct extensive experiments on 6 popular tracking tasks across 11 benchmarks and our OneTracker outperforms other models and achieves state-of-the-art performance.

* Accepted to CVPR 2024

Via

Access Paper or Ask Questions

Going Beyond Neural Network Feature Similarity: The Network Feature Complexity and Its Interpretation Using Category Theory

Oct 10, 2023

Yiting Chen, Zhanpeng Zhou, Junchi Yan

Abstract:The behavior of neural networks still remains opaque, and a recently widely noted phenomenon is that networks often achieve similar performance when initialized with different random parameters. This phenomenon has attracted significant attention in measuring the similarity between features learned by distinct networks. However, feature similarity could be vague in describing the same feature since equivalent features hardly exist. In this paper, we expand the concept of equivalent feature and provide the definition of what we call functionally equivalent features. These features produce equivalent output under certain transformations. Using this definition, we aim to derive a more intrinsic metric for the so-called feature complexity regarding the redundancy of features learned by a neural network at each layer. We offer a formal interpretation of our approach through the lens of category theory, a well-developed area in mathematics. To quantify the feature complexity, we further propose an efficient algorithm named Iterative Feature Merging. Our experimental results validate our ideas and theories from various perspectives. We empirically demonstrate that the functionally equivalence widely exists among different features learned by the same neural network and we could reduce the number of parameters of the network without affecting the performance.The IFM shows great potential as a data-agnostic model prune method. We have also drawn several interesting empirical findings regarding the defined feature complexity.

Via

Access Paper or Ask Questions

Differentiable Robot Neural Distance Function for Adaptive Grasp Synthesis on a Unified Robotic Arm-Hand System

Sep 28, 2023

Yiting Chen, Xiao Gao, Kunpeng Yao, Loïc Niederhauser, Yasemin Bekiroglu, Aude Billard

Abstract:Grasping is a fundamental skill for robots to interact with their environment. While grasp execution requires coordinated movement of the hand and arm to achieve a collision-free and secure grip, many grasp synthesis studies address arm and hand motion planning independently, leading to potentially unreachable grasps in practical settings. The challenge of determining integrated arm-hand configurations arises from its computational complexity and high-dimensional nature. We address this challenge by presenting a novel differentiable robot neural distance function. Our approach excels in capturing intricate geometry across various joint configurations while preserving differentiability. This innovative representation proves instrumental in efficiently addressing downstream tasks with stringent contact constraints. Leveraging this, we introduce an adaptive grasp synthesis framework that exploits the full potential of the unified arm-hand system for diverse grasping tasks. Our neural joint space distance function achieves an 84.7% error reduction compared to baseline methods. We validated our approaches on a unified robotic arm-hand system that consists of a 7-DoF robot arm and a 16-DoF multi-fingered robotic hand. Results demonstrate that our approach empowers this high-DoF system to generate and execute various arm-hand grasp configurations that adapt to the size of the target objects while ensuring whole-body movements to be collision-free.

* Under review

Via

Access Paper or Ask Questions

Sliding Touch-based Exploration for Modeling Unknown Object Shape with Multi-fingered Hands

Aug 01, 2023

Yiting Chen, Ahmet Ercan Tekden, Marc Peter Deisenroth, Yasemin Bekiroglu

Figure 1 for Sliding Touch-based Exploration for Modeling Unknown Object Shape with Multi-fingered Hands

Figure 2 for Sliding Touch-based Exploration for Modeling Unknown Object Shape with Multi-fingered Hands

Figure 3 for Sliding Touch-based Exploration for Modeling Unknown Object Shape with Multi-fingered Hands

Figure 4 for Sliding Touch-based Exploration for Modeling Unknown Object Shape with Multi-fingered Hands

Abstract:Efficient and accurate 3D object shape reconstruction contributes significantly to the success of a robot's physical interaction with its environment. Acquiring accurate shape information about unknown objects is challenging, especially in unstructured environments, e.g. the vision sensors may only be able to provide a partial view. To address this issue, tactile sensors could be employed to extract local surface information for more robust unknown object shape estimation. In this paper, we propose a novel approach for efficient unknown 3D object shape exploration and reconstruction using a multi-fingered hand equipped with tactile sensors and a depth camera only providing a partial view. We present a multi-finger sliding touch strategy for efficient shape exploration using a Bayesian Optimization approach and a single-leader-multi-follower strategy for multi-finger smooth local surface perception. We evaluate our proposed method by estimating the 3D shape of objects from the YCB and OCRTOC datasets based on simulation and real robot experiments. The proposed approach yields successful reconstruction results relying on only a few continuous sliding touches. Experimental results demonstrate that our method is able to model unknown objects in an efficient and accurate way.

* 8 pages, 11 figures. Accepted by IROS 2023

Via

Access Paper or Ask Questions

Energy-based Out-of-Distribution Detection for Graph Neural Networks

Feb 06, 2023

Qitian Wu, Yiting Chen, Chenxiao Yang, Junchi Yan

Abstract:Learning on graphs, where instance nodes are inter-connected, has become one of the central problems for deep learning, as relational structures are pervasive and induce data inter-dependence which hinders trivial adaptation of existing approaches that assume inputs to be i.i.d.~sampled. However, current models mostly focus on improving testing performance of in-distribution data and largely ignore the potential risk w.r.t. out-of-distribution (OOD) testing samples that may cause negative outcome if the prediction is overconfident on them. In this paper, we investigate the under-explored problem, OOD detection on graph-structured data, and identify a provably effective OOD discriminator based on an energy function directly extracted from graph neural networks trained with standard classification loss. This paves a way for a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe. It also has nice theoretical properties that guarantee an overall distinguishable margin between the detection scores for in-distribution and OOD samples, which, more critically, can be further strengthened by a learning-free energy belief propagation scheme. For comprehensive evaluation, we introduce new benchmark settings that evaluate the model for detecting OOD data from both synthetic and real distribution shifts (cross-domain graph shifts and temporal graph shifts). The results show that GNNSafe achieves up to $17.0\%$ AUROC improvement over state-of-the-arts and it could serve as simple yet strong baselines in such an under-developed area.

* Accepted by International Conference on Learning Representations (ICLR 2023)

Via

Access Paper or Ask Questions

Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid Objects

May 12, 2022

Junjia Liu, Yiting Chen, Zhipeng Dong, Shixiong Wang, Sylvain Calinon, Miao Li, Fei Chen

Figure 1 for Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid Objects

Figure 2 for Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid Objects

Figure 3 for Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid Objects

Figure 4 for Robot Cooking with Stir-fry: Bimanual Non-prehensile Manipulation of Semi-fluid Objects

Abstract:This letter describes an approach to achieve well-known Chinese cooking art stir-fry on a bimanual robot system. Stir-fry requires a sequence of highly dynamic coordinated movements, which is usually difficult to learn for a chef, let alone transfer to robots. In this letter, we define a canonical stir-fry movement, and then propose a decoupled framework for learning this deformable object manipulation from human demonstration. First, the dual arms of the robot are decoupled into different roles (a leader and follower) and learned with classical and neural network-based methods separately, then the bimanual task is transformed into a coordination problem. To obtain general bimanual coordination, we secondly propose a Graph and Transformer based model -- Structured-Transformer, to capture the spatio-temporal relationship between dual-arm movements. Finally, by adding visual feedback of content deformation, our framework can adjust the movements automatically to achieve the desired stir-fry effect. We verify the framework by a simulator and deploy it on a real bimanual Panda robot system. The experimental results validate our framework can realize the bimanual robot stir-fry motion and have the potential to extend to other deformable objects with bimanual coordination.

* IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 5159-5166, April 2022
* 8 pages, 8 figures, published to RA-L

Via

Access Paper or Ask Questions

A Unified Game-Theoretic Interpretation of Adversarial Robustness

Nov 08, 2021

Jie Ren, Die Zhang, Yisen Wang, Lu Chen, Zhanpeng Zhou, Yiting Chen, Xu Cheng, Xin Wang, Meng Zhou, Jie Shi(+1 more)

Figure 1 for A Unified Game-Theoretic Interpretation of Adversarial Robustness

Figure 2 for A Unified Game-Theoretic Interpretation of Adversarial Robustness

Figure 3 for A Unified Game-Theoretic Interpretation of Adversarial Robustness

Figure 4 for A Unified Game-Theoretic Interpretation of Adversarial Robustness

Abstract:This paper provides a unified view to explain different adversarial attacks and defense methods, \emph{i.e.} the view of multi-order interactions between input variables of DNNs. Based on the multi-order interaction, we discover that adversarial attacks mainly affect high-order interactions to fool the DNN. Furthermore, we find that the robustness of adversarially trained DNNs comes from category-specific low-order interactions. Our findings provide a potential method to unify adversarial perturbations and robustness, which can explain the existing defense methods in a principle way. Besides, our findings also make a revision of previous inaccurate understanding of the shape bias of adversarially learned features.

* the previous version is arXiv:2103.07364, but I mistakenly apply a new ID for the paper

Via

Access Paper or Ask Questions