Alan Kuntz

DefFusionNet: Learning Multimodal Goal Shapes for Deformable Object Manipulation via a Diffusion-based Probabilistic Model

Jun 23, 2025

Bao Thach, Siyeon Kim, Britton Jordan, Mohanraj Shanthi, Tanner Watts, Shing-Hei Ho, James M. Ferguson, Tucker Hermans, Alan Kuntz

Abstract:Deformable object manipulation is critical to many real-world robotic applications, ranging from surgical robotics and soft material handling in manufacturing to household tasks like laundry folding. At the core of this important robotic field is shape servoing, a task focused on controlling deformable objects into desired shapes. The shape servoing formulation requires the specification of a goal shape. However, most prior works in shape servoing rely on impractical goal shape acquisition methods, such as laborious domain-knowledge engineering or manual manipulation. DefGoalNet, the previous state-of-the-art solution to this problem, learns deformable object goal shapes directly from a small number of human demonstrations. However, it struggles significantly in multi-modal settings, where multiple distinct goal shapes can all lead to successful task completion. As a deterministic model, DefGoalNet collapses these possibilities into a single averaged solution, often resulting in an unusable goal. In this paper, we address this problem by developing DefFusionNet, a novel neural network that leverages the diffusion probabilistic model to learn a distribution over all valid goal shapes rather than predicting a single deterministic outcome. This enables the generation of diverse goal shapes and avoids the averaging artifacts. We demonstrate our method's effectiveness on robotic tasks inspired by both manufacturing and surgical applications, both in simulation and on a physical robot. Our work is the first generative model capable of producing a diverse, multi-modal set of deformable object goals for real-world robotic applications.
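
The averaging artifact described in this abstract can be illustrated with a toy example. The sketch below is not DefFusionNet or DefGoalNet; it assumes a hypothetical one-dimensional "goal" with two equally valid modes at -1 and +1, and compares the single point prediction that minimizes squared error (the mean) against samples drawn from the demonstrated distribution. All names and numbers are illustrative assumptions.

```python
import random
import statistics

# Hypothetical valid goal modes, e.g. "wrap left" vs "wrap right".
MODES = (-1.0, 1.0)


def make_demonstrations(n, seed=0):
    """Draw n noisy demonstrations, each near one of the two valid modes."""
    rng = random.Random(seed)
    return [rng.choice(MODES) + rng.gauss(0.0, 0.05) for _ in range(n)]


def distance_to_nearest_mode(goal):
    """How far a proposed goal is from the closest valid mode."""
    return min(abs(goal - m) for m in MODES)


demos = make_demonstrations(200)

# A deterministic predictor trained with squared error converges to the mean,
# which lies between the modes and matches neither of them.
averaged_goal = statistics.fmean(demos)

# Sampling from the (here: empirical) distribution keeps each goal near a mode.
rng = random.Random(1)
sampled_goals = [rng.choice(demos) for _ in range(5)]

print("averaged goal:", round(averaged_goal, 3))
print("sampled goals:", [round(g, 3) for g in sampled_goals])
```

The averaged goal lands far from both modes, while every sampled goal stays close to one of them, which is the behavior a generative model over goals is meant to preserve.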

From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction

Mar 20, 2025

Ayberk Acar, Mariana Smith, Lidia Al-Zogbi, Tanner Watts, Fangjie Li, Hao Li, Nural Yilmaz, Paul Maria Scheikl, Jesse F. d'Almeida, Susheela Sharma(+7 more)

Abstract:Surgical automation requires precise guidance and understanding of the scene. Current methods in the literature rely on bulky depth cameras to create maps of the anatomy; however, this does not translate well to space-limited clinical applications. Monocular cameras are small and allow minimally invasive surgeries in tight spaces, but additional processing is required to generate 3D scene understanding. We propose a 3D mapping pipeline that uses only RGB images to create segmented point clouds of the target anatomy. To ensure the most precise reconstruction, we compare the performance of different structure-from-motion algorithms on mapping central airway obstructions, and test the pipeline on a downstream task of tumor resection. In several metrics, including post-procedure tissue model evaluation, our pipeline performs comparably to RGB-D cameras and, in some cases, even surpasses their performance. These promising results demonstrate that automation guidance can be achieved in minimally invasive procedures with monocular cameras. This study is a step toward the complete autonomy of surgical robots.

* 7 Pages, 8 Figures, 1 Table. This work has been submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) for possible publication

Early Failure Detection in Autonomous Surgical Soft-Tissue Manipulation via Uncertainty Quantification

Jan 17, 2025

Jordan Thompson, Ronald Koe, Anthony Le, Gabriella Goodman, Daniel S. Brown, Alan Kuntz

Abstract:Autonomous surgical robots are a promising solution to the increasing demand for surgery amid a shortage of surgeons. Recent work has proposed learning-based approaches for the autonomous manipulation of soft tissue. However, due to variability in tissue geometries and stiffnesses, these methods do not always perform optimally, especially in out-of-distribution settings. We propose, develop, and test the first application of uncertainty quantification to learned surgical soft-tissue manipulation policies as an early identification system for task failures. We analyze two different methods of uncertainty quantification, deep ensembles and Monte Carlo dropout, and find that deep ensembles provide a stronger signal of future task success or failure. We validate our approach using the physical daVinci Research Kit (dVRK) surgical robot to perform physical soft-tissue manipulation. We show that we are able to successfully detect task failure and request human intervention when necessary while still enabling autonomous manipulation when possible. Our learned tissue manipulation policy with uncertainty-based early failure detection achieves a zero-shot sim2real performance improvement of 47.5% over the prior state of the art in learned soft-tissue manipulation. We also show that our method generalizes well to new types of tissue as well as to a bimanual soft tissue manipulation task.
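
One common way to turn a deep ensemble into a failure signal is to measure how much the ensemble members disagree. The sketch below is a minimal illustration of that idea, not the paper's implementation: the function names, the variance-based score, and the threshold value are all assumptions made for the example.

```python
import statistics


def ensemble_uncertainty(predictions):
    """Mean per-dimension variance across ensemble members.

    predictions: list of action vectors, one per ensemble member,
    all of the same length (hypothetical policy outputs).
    """
    per_dimension = list(zip(*predictions))
    return statistics.fmean([statistics.pvariance(d) for d in per_dimension])


def should_request_help(predictions, threshold):
    """Flag the step for human intervention when disagreement is high.

    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    return ensemble_uncertainty(predictions) > threshold


# Members agree: low uncertainty, keep operating autonomously.
agreeing = [[0.1, 0.2], [0.1, 0.2], [0.1, 0.2]]
# Members disagree on the first action dimension: high uncertainty.
disagreeing = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]

print(should_request_help(agreeing, threshold=0.1))
print(should_request_help(disagreeing, threshold=0.1))
```

Monte Carlo dropout could be scored the same way by treating repeated stochastic forward passes of one network as the "members".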

* 8 pages, 6 figures

Leveraging Fixed-Parameter Tractability for Robot Inspection Planning

Jun 28, 2024

Yosuke Mizutani, Daniel Coimbra Salomao, Alex Crane, Matthias Bentert, Pål Grønås Drange, Felix Reidl, Alan Kuntz, Blair D. Sullivan

Abstract:Autonomous robotic inspection, where a robot moves through its environment and inspects points of interest, has applications in industrial settings, structural health monitoring, and medicine. Planning the paths for a robot to safely and efficiently perform such an inspection is an extremely difficult algorithmic challenge. In this work we consider an abstraction of the inspection planning problem which we term Graph Inspection. We give two exact algorithms for this problem, using dynamic programming and integer linear programming. We analyze the performance of these methods, and present multiple approaches to achieve scalability. We demonstrate significant improvement both in path weight and inspection coverage over a state-of-the-art approach on two robotics tasks in simulation, a bridge inspection task by a UAV and a surgical inspection task using a medical robot.

Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery

Apr 10, 2024

Zohre Karimi, Shing-Hei Ho, Bao Thach, Alan Kuntz, Daniel S. Brown

Abstract:Automating robotic surgery via learning from demonstration (LfD) techniques is extremely challenging. This is because surgical tasks often involve sequential decision-making processes with complex interactions of physical objects and have low tolerance for mistakes. Prior works assume that all demonstrations are fully observable and optimal, which might not be practical in the real world. This paper introduces a sample-efficient method that learns a robust reward function from a limited amount of ranked suboptimal demonstrations consisting of partial-view point cloud observations. The method then learns a policy by optimizing the learned reward function using reinforcement learning (RL). We show that using a learned reward function to obtain a policy is more robust than pure imitation learning. We apply our approach on a physical surgical electrocautery task and demonstrate that our method can perform well even when the provided demonstrations are suboptimal and the observations are high-dimensional point clouds.

* In proceedings of the International Symposium on Medical Robotics (ISMR) 2024. Equal contribution from two first authors

Modeling Kinematic Uncertainty of Tendon-Driven Continuum Robots via Mixture Density Networks

Apr 05, 2024

Jordan Thompson, Brian Y. Cho, Daniel S. Brown, Alan Kuntz

Abstract:Tendon-driven continuum robot kinematic models are frequently computationally expensive, inaccurate due to unmodeled effects, or both. In particular, unmodeled effects produce uncertainties that arise during the robot's operation that lead to variability in the resulting geometry. We propose a novel solution to these issues through the development of a Gaussian mixture kinematic model. We train a mixture density network to output a Gaussian mixture model representation of the robot geometry given the current tendon displacements. This model computes a probability distribution that is more representative of the true distribution of geometries at a given configuration than a model that outputs a single geometry, while also reducing the computation time. We demonstrate one use of this model through a trajectory optimization method that explicitly reasons about the workspace uncertainty to minimize the probability of collision.
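
To see why a Gaussian mixture output is useful for planning, consider a simplified one-dimensional case. The sketch below assumes a hypothetical mixture over a single coordinate of the robot tip and a wall at a fixed position; it computes the probability that the coordinate lies beyond the wall. It illustrates the general idea of reasoning about collision probability under a mixture, not the paper's model or optimizer.

```python
from statistics import NormalDist


def collision_probability(weights, means, stddevs, wall):
    """P(x > wall) when x follows a 1D Gaussian mixture.

    weights, means, stddevs: per-component mixture parameters
    (hypothetical network outputs; weights are assumed to sum to 1).
    wall: position of the obstacle boundary along the same axis.
    """
    return sum(
        w * (1.0 - NormalDist(mu, sigma).cdf(wall))
        for w, mu, sigma in zip(weights, means, stddevs)
    )


# Two equally weighted components on either side of a wall at x = 1.
p = collision_probability(
    weights=[0.5, 0.5], means=[0.0, 2.0], stddevs=[1.0, 1.0], wall=1.0
)
print(round(p, 6))
```

A single-geometry model would report either "collision" or "no collision" for this configuration, whereas the mixture exposes the probability mass on each side of the wall, which a trajectory optimizer can then penalize.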

Accounting for Hysteresis in the Forward Kinematics of Nonlinearly-Routed Tendon-Driven Continuum Robots via a Learned Deep Decoder Network

Apr 04, 2024

Brian Y. Cho, Daniel S. Esser, Jordan Thompson, Bao Thach, Robert J. Webster III, Alan Kuntz

Abstract:Tendon-driven continuum robots have been gaining popularity in medical applications due to their ability to curve around complex anatomical structures, potentially reducing the invasiveness of surgery. However, accurate modeling is required to plan and control the movements of these flexible robots. Physics-based models have limitations due to unmodeled effects, leading to mismatches between model prediction and actual robot shape. Recently proposed learning-based methods have been shown to overcome some of these limitations but do not account for hysteresis, a significant source of error for these robots. To overcome these challenges, we propose a novel deep decoder neural network that predicts the complete shape of tendon-driven robots using point clouds as the shape representation, conditioned on prior configurations to account for hysteresis. We evaluate our method on a physical tendon-driven robot and show that our network model accurately predicts the robot's shape, significantly outperforming a state-of-the-art physics-based model and a learning-based model that does not account for hysteresis.

* 8 pages, 9 figures, Submitted to IEEE Robotics and Automation Letters

General-purpose foundation models for increased autonomy in robot-assisted surgery

Jan 01, 2024

Samuel Schmidgall, Ji Woong Kim, Alan Kuntz, Ahmed Ezzat Ghazi, Axel Krieger

Abstract:The dominant paradigm for end-to-end robot learning focuses on optimizing task-specific objectives that solve a single robotic problem such as picking up an object or reaching a target position. However, recent work on high-capacity models in robotics has shown promise toward being trained on large collections of diverse and task-agnostic datasets of video demonstrations. These models have shown impressive levels of generalization to unseen circumstances, especially as the amount of data and the model complexity scale. Surgical robot systems that learn from data have struggled to advance as quickly as other fields of robot learning for a few reasons: (1) there is a lack of existing large-scale open-source data to train models, (2) it is challenging to model the soft-body deformations that these robots work with during surgery because simulation cannot match the physical and visual complexity of biological tissue, and (3) surgical robots risk harming patients when tested in clinical trials and require more extensive safety measures. This perspective article aims to provide a path toward increasing robot autonomy in robot-assisted surgery through the development of a multi-modal, multi-task, vision-language-action model for surgical robots. Ultimately, we argue that surgical robots are uniquely positioned to benefit from general-purpose models and provide three guiding actions toward increased autonomy in robot-assisted surgery.

DefGoalNet: Contextual Goal Learning from Demonstrations For Deformable Object Manipulation

Sep 25, 2023

Bao Thach, Tanner Watts, Shing-Hei Ho, Tucker Hermans, Alan Kuntz

Abstract:Shape servoing, a robotic task dedicated to controlling objects to desired goal shapes, is a promising approach to deformable object manipulation. An issue arises, however, with the reliance on the specification of a goal shape. This goal has been obtained either by a laborious domain knowledge engineering process or by manually manipulating the object into the desired shape and capturing the goal shape at that specific moment, both of which are impractical in various robotic applications. In this paper, we solve this problem by developing a novel neural network DefGoalNet, which learns deformable object goal shapes directly from a small number of human demonstrations. We demonstrate our method's effectiveness on various robotic tasks, both in simulation and on a physical robot. Notably, in the surgical retraction task, even when trained with as few as 10 demonstrations, our method achieves a median success percentage of nearly 90%. These results mark a substantial advancement in enabling shape servoing methods to bring deformable object manipulation closer to practical, real-world applications.

* Submitted to IEEE Conference on Robotics and Automation (ICRA) 2024. 8 pages, 11 figures

Efficient and Accurate Mapping of Subsurface Anatomy via Online Trajectory Optimization for Robot Assisted Surgery

Sep 18, 2023

Brian Y. Cho, Alan Kuntz

Abstract:Robotic surgical subtask automation has the potential to reduce the per-patient workload of human surgeons. There are a variety of surgical subtasks that require geometric information of subsurface anatomy, such as the location of tumors, which necessitates accurate and efficient surgical sensing. In this work, we propose an automated sensing method that maps 3D subsurface anatomy to provide such geometric knowledge. We model the anatomy via a Bayesian Hilbert map-based probabilistic 3D occupancy map. Using the 3D occupancy map, we plan sensing paths on the surface of the anatomy via a graph search algorithm, $A^*$ search, with a cost function that enables the trajectories generated to balance between exploration of unsensed regions and refining the existing probabilistic understanding. We demonstrate the performance of our proposed method by comparing it against 3 different methods in several anatomical environments including a real-life CT scan dataset. The experimental results show that our method efficiently detects relevant subsurface anatomy with shorter trajectories than the comparison methods, and the resulting occupancy map achieves high accuracy.

* 7 pages, 6 figures, submitted to 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)
