Bao Thach

DefFusionNet: Learning Multimodal Goal Shapes for Deformable Object Manipulation via a Diffusion-based Probabilistic Model

Jun 23, 2025

Bao Thach, Siyeon Kim, Britton Jordan, Mohanraj Shanthi, Tanner Watts, Shing-Hei Ho, James M. Ferguson, Tucker Hermans, Alan Kuntz

Abstract: Deformable object manipulation is critical to many real-world robotic applications, ranging from surgical robotics and soft material handling in manufacturing to household tasks like laundry folding. At the core of this important robotic field is shape servoing, a task focused on controlling deformable objects into desired shapes. The shape servoing formulation requires the specification of a goal shape. However, most prior works in shape servoing rely on impractical goal shape acquisition methods, such as laborious domain-knowledge engineering or manual manipulation. DefGoalNet, the previous state-of-the-art solution to this problem, learns deformable object goal shapes directly from a small number of human demonstrations. However, it struggles significantly in multi-modal settings, where multiple distinct goal shapes can all lead to successful task completion. As a deterministic model, DefGoalNet collapses these possibilities into a single averaged solution, often resulting in an unusable goal. In this paper, we address this problem by developing DefFusionNet, a novel neural network that leverages a diffusion probabilistic model to learn a distribution over all valid goal shapes rather than predicting a single deterministic outcome. This enables the generation of diverse goal shapes and avoids the averaging artifacts. We demonstrate our method's effectiveness on robotic tasks inspired by both manufacturing and surgical applications, both in simulation and on a physical robot. Our work is the first generative model capable of producing a diverse, multi-modal set of deformable object goals for real-world robotic applications.
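The averaging failure described in this abstract can be illustrated with a toy sketch. This is not DefFusionNet or a diffusion model: a goal shape is reduced to a single hypothetical scalar, the "deterministic model" is simply the mean of the demonstrated goals (the prediction that minimizes squared error), and the "generative model" is a stand-in that samples from the empirical goal distribution. All names and values are assumptions made for illustration.

```python
import random

# Hypothetical 1D stand-in for a goal shape: every demonstration ends at one
# of two equally valid goals (for example, fold left = -1.0, fold right = +1.0).
GOAL_MODES = [-1.0, 1.0]


def deterministic_goal(demos):
    """Single prediction that minimizes squared error: the mean of the goals."""
    return sum(demos) / len(demos)


def sampled_goal(demos, rng):
    """Stand-in for a generative model: draw one goal from the empirical distribution."""
    return rng.choice(demos)


rng = random.Random(0)
demos = [GOAL_MODES[i % 2] for i in range(10)]  # five demonstrations per mode

averaged = deterministic_goal(demos)
samples = [sampled_goal(demos, rng) for _ in range(5)]

print(averaged)  # 0.0, which matches neither valid goal
print(samples)   # every entry is one of the two valid goals
```

The averaged goal sits between the two modes and corresponds to no demonstrated outcome, while every sample lands on a valid goal. That is the motivation the abstract gives for learning a distribution instead of a point estimate.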

Generative LiDAR Editing with Controllable Novel Object Layouts

Nov 30, 2024

Shing-Hei Ho, Bao Thach, Minghan Zhu

Abstract: We propose a framework to edit real-world LiDAR scans with novel object layouts while preserving a realistic background environment. Compared to synthetic data generation frameworks, where LiDAR point clouds are generated from scratch, our framework focuses on new scenario generation in a given background environment, and our method also provides labels for the generated data. This approach ensures the generated data remains relevant to the specific environment, aiding both the development and the evaluation of algorithms in real-world scenarios. Compared with novel view synthesis, our framework allows the creation of counterfactual scenarios with significant changes in the object layout and does not rely on multi-frame optimization. In our framework, object removal and insertion are supported by generative background inpainting and object point cloud completion, and the entire pipeline is built upon spherical voxelization, which realizes the correct LiDAR projective geometry by construction. Experiments show that our framework generates realistic LiDAR scans with object layout changes and benefits the development of LiDAR-based self-driving systems.

* Submitted to IEEE International Conference on Robotics and Automation (ICRA). 6 pages, 7 figures
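As a rough illustration of the spherical voxelization mentioned in the abstract above, the sketch below bins a Cartesian point into (range, azimuth, elevation) indices. The function name, bin counts, range resolution, and the symmetric elevation limits are assumptions chosen for a readable example; the paper's actual grid parameters are not given here.

```python
import math


def spherical_voxel_index(x, y, z, range_res=0.5, az_bins=360, el_bins=32,
                          el_min=-0.5, el_max=0.5):
    """Map a point in the sensor frame to (range, azimuth, elevation) bin indices.

    All grid parameters are illustrative assumptions. Angles are in radians.
    Returns None for the origin or for points outside the elevation limits.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return None
    azimuth = math.atan2(y, x)    # in [-pi, pi]
    elevation = math.asin(z / r)  # in [-pi/2, pi/2]
    if not (el_min <= elevation < el_max):
        return None
    r_idx = int(r / range_res)
    # The modulo wraps azimuth == pi back to bin 0.
    az_idx = int((azimuth + math.pi) / (2.0 * math.pi) * az_bins) % az_bins
    el_idx = int((elevation - el_min) / (el_max - el_min) * el_bins)
    return (r_idx, az_idx, el_idx)


# Two points on the same ray from the sensor: same angular cell, different range bin.
near = spherical_voxel_index(3.0, 4.0, 1.0)
far = spherical_voxel_index(6.0, 8.0, 2.0)
print(near, far)
```

Because two points on one ray share an angular cell, a nearer point can be treated as occluding a farther one within that cell, which is one way to read the abstract's claim that the representation respects the sensor's projective geometry by construction.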

Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery

Apr 10, 2024

Zohre Karimi, Shing-Hei Ho, Bao Thach, Alan Kuntz, Daniel S. Brown

Abstract: Automating robotic surgery via learning from demonstration (LfD) techniques is extremely challenging. This is because surgical tasks often involve sequential decision-making processes with complex interactions of physical objects and have low tolerance for mistakes. Prior works assume that all demonstrations are fully observable and optimal, which might not be practical in the real world. This paper introduces a sample-efficient method that learns a robust reward function from a limited amount of ranked suboptimal demonstrations consisting of partial-view point cloud observations. The method then learns a policy by optimizing the learned reward function using reinforcement learning (RL). We show that using a learned reward function to obtain a policy is more robust than pure imitation learning. We apply our approach to a physical surgical electrocautery task and demonstrate that our method can perform well even when the provided demonstrations are suboptimal and the observations are high-dimensional point clouds.

* In proceedings of the International Symposium on Medical Robotics (ISMR) 2024. Equal contribution from two first authors
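The abstract above does not state which loss is used to learn a reward from ranked demonstrations. One common formulation for this kind of problem is a pairwise Bradley-Terry ranking loss over trajectory returns, sketched below as an assumption and not as the paper's method. Observations are reduced to hypothetical scalars (think of a distance to the target) in place of point clouds, and the candidate reward functions are hand-written for illustration.

```python
import math


def trajectory_return(reward_fn, trajectory):
    """Sum of predicted per-observation rewards along one demonstration."""
    return sum(reward_fn(obs) for obs in trajectory)


def ranking_loss(reward_fn, worse, better):
    """Negative log-likelihood that `better` outranks `worse` (Bradley-Terry).

    Equals -log(exp(R_b) / (exp(R_w) + exp(R_b))), computed stably as
    softplus(R_w - R_b).
    """
    d = trajectory_return(reward_fn, worse) - trajectory_return(reward_fn, better)
    return max(d, 0.0) + math.log1p(math.exp(-abs(d)))


# Hypothetical scalar observations: distance to the target at each step.
worse_demo = [3.0, 2.5, 2.0]   # ends far from the target
better_demo = [3.0, 1.5, 0.5]  # ends close to the target

loss_good = ranking_loss(lambda obs: -obs, worse_demo, better_demo)  # rewards closeness
loss_flat = ranking_loss(lambda obs: 0.0, worse_demo, better_demo)   # uninformative
loss_bad = ranking_loss(lambda obs: obs, worse_demo, better_demo)    # rewards distance

print(loss_good, loss_flat, loss_bad)
```

A reward that agrees with the ranking scores below log 2 (the value for an uninformative reward), and one that contradicts it scores above. Minimizing this loss over many ranked pairs therefore pushes a parameterized reward toward explaining the ranking, without requiring any single demonstration to be optimal.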

Accounting for Hysteresis in the Forward Kinematics of Nonlinearly-Routed Tendon-Driven Continuum Robots via a Learned Deep Decoder Network

Apr 04, 2024

Brian Y. Cho, Daniel S. Esser, Jordan Thompson, Bao Thach, Robert J. Webster III, Alan Kuntz

Abstract: Tendon-driven continuum robots have been gaining popularity in medical applications due to their ability to curve around complex anatomical structures, potentially reducing the invasiveness of surgery. However, accurate modeling is required to plan and control the movements of these flexible robots. Physics-based models have limitations due to unmodeled effects, leading to mismatches between model prediction and actual robot shape. Recently proposed learning-based methods have been shown to overcome some of these limitations but do not account for hysteresis, a significant source of error for these robots. To overcome these challenges, we propose a novel deep decoder neural network that predicts the complete shape of tendon-driven robots using point clouds as the shape representation, conditioned on prior configurations to account for hysteresis. We evaluate our method on a physical tendon-driven robot and show that our network model accurately predicts the robot's shape, significantly outperforming a state-of-the-art physics-based model and a learning-based model that does not account for hysteresis.

* 8 pages, 9 figures, Submitted to IEEE Robotics and Automation Letters

DefGoalNet: Contextual Goal Learning from Demonstrations For Deformable Object Manipulation

Sep 25, 2023

Bao Thach, Tanner Watts, Shing-Hei Ho, Tucker Hermans, Alan Kuntz

Abstract: Shape servoing, a robotic task dedicated to controlling objects to desired goal shapes, is a promising approach to deformable object manipulation. An issue arises, however, with the reliance on the specification of a goal shape. This goal has been obtained either by a laborious domain knowledge engineering process or by manually manipulating the object into the desired shape and capturing the goal shape at that specific moment, both of which are impractical in various robotic applications. In this paper, we solve this problem by developing a novel neural network DefGoalNet, which learns deformable object goal shapes directly from a small number of human demonstrations. We demonstrate our method's effectiveness on various robotic tasks, both in simulation and on a physical robot. Notably, in the surgical retraction task, even when trained with as few as 10 demonstrations, our method achieves a median success percentage of nearly 90%. These results mark a substantial advancement in enabling shape servoing methods to bring deformable object manipulation closer to practical, real-world applications.

* Submitted to IEEE Conference on Robotics and Automation (ICRA) 2024. 8 pages, 11 figures

DeformerNet: Learning Bimanual Manipulation of 3D Deformable Objects

May 08, 2023

Bao Thach, Brian Y. Cho, Tucker Hermans, Alan Kuntz

Abstract: Applications in fields ranging from home care to warehouse fulfillment to surgical assistance require robots to reliably manipulate the shape of 3D deformable objects. Analytic models of elastic, 3D deformable objects require numerous parameters to describe the potentially infinite degrees of freedom present in determining the object's shape. Previous attempts at performing 3D shape control rely on hand-crafted features to represent the object shape and require training of object-specific control models. We overcome these issues through the use of our novel DeformerNet neural network architecture, which operates on a partial-view point cloud of the manipulated object and a point cloud of the goal shape to learn a low-dimensional representation of the object shape. This shape embedding enables the robot to learn a visual servo controller that computes the desired robot end-effector action to iteratively deform the object toward the target shape. We demonstrate both in simulation and on a physical robot that DeformerNet reliably generalizes to object shapes and material stiffness not seen during training. Crucially, using DeformerNet, the robot successfully accomplishes three surgical sub-tasks: retraction (moving tissue aside to access a site underneath it), tissue wrapping (a sub-task in procedures like aortic stent placements), and connecting two tubular pieces of tissue (a sub-task in anastomosis).

* Submitted to IEEE Transactions on Robotics (T-RO). 18 pages, 25 figures. arXiv admin note: substantial text overlap with arXiv:2110.04685
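Shape servoing as described in the abstract above compares a point cloud of the current object shape with a point cloud of the goal shape. A standard way to quantify how far apart two point clouds are is the Chamfer distance; the abstract does not say which metric the authors use, so the brute-force sketch below is an assumption meant only to make "deforming the object toward the target shape" concrete. The point coordinates are made up.

```python
import math


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two non-empty point clouds.

    Each cloud is a list of (x, y, z) tuples. Brute force, O(len(a) * len(b)).
    """
    def mean_nearest(src, dst):
        # Average distance from each source point to its nearest destination point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(cloud_a, cloud_b) + mean_nearest(cloud_b, cloud_a)


# Hypothetical goal shape and a current shape offset by 1 unit along z.
goal = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
current = [(x, y, z + 1.0) for (x, y, z) in goal]

print(chamfer_distance(current, goal))  # 2.0
print(chamfer_distance(goal, goal))     # 0.0
```

In a closed-loop setting, a quantity like this could be re-evaluated after each end-effector action to check whether the object is approaching the goal shape.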

Learning Visual Shape Control of Novel 3D Deformable Objects from Partial-View Point Clouds

Oct 10, 2021

Bao Thach, Brian Y. Cho, Alan Kuntz, Tucker Hermans

Abstract: If robots could reliably manipulate the shape of 3D deformable objects, they could find applications in fields ranging from home care to warehouse fulfillment to surgical assistance. Analytic models of elastic, 3D deformable objects require numerous parameters to describe the potentially infinite degrees of freedom present in determining the object's shape. Previous attempts at performing 3D shape control rely on hand-crafted features to represent the object shape and require training of object-specific control models. We overcome these issues through the use of our novel DeformerNet neural network architecture, which operates on a partial-view point cloud of the object being manipulated and a point cloud of the goal shape to learn a low-dimensional representation of the object shape. This shape embedding enables the robot to learn a visual servo controller that provides Cartesian pose changes to the robot end-effector, causing the object to deform towards its target shape. Crucially, we demonstrate both in simulation and on a physical robot that DeformerNet reliably generalizes to object shapes and material stiffness not seen during training and outperforms comparison methods for both the generic shape control and the surgical task of retraction.

* Submitted to IEEE Conference on Robotics and Automation (ICRA) 2022. 8 pages, 10 figures

DeformerNet: A Deep Learning Approach to 3D Deformable Object Manipulation

Jul 16, 2021

Bao Thach, Alan Kuntz, Tucker Hermans

Abstract: In this paper, we propose a novel approach to 3D deformable object manipulation leveraging a deep neural network called DeformerNet. Controlling the shape of a 3D object requires an effective state representation that can capture the full 3D geometry of the object. Current methods work around this problem by defining a set of feature points on the object or only deforming the object in 2D image space, which does not truly address the 3D shape control problem. Instead, we explicitly use 3D point clouds as the state representation and apply a convolutional neural network to the point clouds to learn the 3D features. These features are then mapped to the robot end-effector's position using a fully-connected neural network. Once trained in an end-to-end fashion, DeformerNet directly maps the current point cloud of a deformable object, as well as a target point cloud shape, to the desired displacement in robot gripper position. In addition, we investigate the problem of predicting the manipulation point location given the initial and goal shape of the object.

* Published at RSS 2021 Workshop on Deformable Object Simulation in Robotics; received Honorable Mention for Best Paper Award
