Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Haoying Zhou

SurgPose: a Dataset for Articulated Robotic Surgical Tool Pose Estimation and Tracking

Feb 17, 2025

Zijian Wu, Adam Schmidt, Randy Moore, Haoying Zhou, Alexandre Banks, Peter Kazanzides, Septimiu E. Salcudean

Abstract:Accurate and efficient surgical robotic tool pose estimation is of fundamental significance to downstream applications such as augmented reality (AR) in surgical training and learning-based autonomous manipulation. While significant advancements have been made in pose estimation for humans and animals, it is still a challenge in surgical robotics due to the scarcity of published data. The relatively large absolute error of the da Vinci end effector kinematics and arduous calibration procedure make calibrated kinematics data collection expensive. Driven by this limitation, we collected a dataset, dubbed SurgPose, providing instance-aware semantic keypoints and skeletons for visual surgical tool pose estimation and tracking. By marking keypoints using ultraviolet (UV) reactive paint, which is invisible under white light and fluorescent under UV light, we execute the same trajectory under different lighting conditions to collect raw videos and keypoint annotations, respectively. The SurgPose dataset consists of approximately 120k surgical instrument instances (80k for training and 40k for validation) of 6 categories. Each instrument instance is labeled with 7 semantic keypoints. Since the videos are collected in stereo pairs, the 2D pose can be lifted to 3D based on stereo-matching depth. In addition to releasing the dataset, we test a few baseline approaches to surgical instrument tracking to demonstrate the utility of SurgPose. More details can be found at surgpose.github.io.

* Accepted by ICRA 2025

Via

Access Paper or Ask Questions

Gravity Compensation of the dVRK-Si Patient Side Manipulator based on Dynamic Model Identification

Jan 31, 2025

Haoying Zhou, Hao Yang, Anton Deguet, Loris Fichera, Jie Ying Wu, Peter Kazanzides

Abstract:The da Vinci Research Kit (dVRK, also known as dVRK Classic) is an open-source teleoperated surgical robotic system whose hardware is obtained from the first generation da Vinci Surgical System (Intuitive, Sunnyvale, CA, USA). The dVRK has greatly facilitated research in robot-assisted surgery over the past decade and helped researchers address multiple major challenges in this domain. Recently, the dVRK-Si system, a new version of the dVRK which uses mechanical components from the da Vinci Si Surgical System, became available to the community. The major difference between the first generation da Vinci and the da Vinci Si is in the structural upgrade of the Patient Side Manipulator (PSM). Because of this upgrade, the gravity of the dVRK-Si PSM can no longer be ignored as in the dVRK Classic. The high gravity offset may lead to relatively low control accuracy and longer response time. In addition, although substantial progress has been made in addressing the dynamic model identification problem for the dVRK Classic, further research is required on model-based control for the dVRK-Si, due to differences in mechanical components and the demand for enhanced control performance. To address these problems, in this work, we present (1) a novel full kinematic model of the dVRK-Si PSM, and (2) a gravity compensation approach based on the dynamic model identification.

Via

Access Paper or Ask Questions

Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations

Oct 07, 2024

Christopher John Allison, Haoying Zhou, Adnan Munawar, Peter Kazanzides, Juan Antonio Barragan

Figure 1 for Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations

Figure 2 for Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations

Figure 3 for Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations

Figure 4 for Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations

Abstract:Interactive dynamic simulators are an accelerator for developing novel robotic control algorithms and complex systems involving humans and robots. In user training and synthetic data generation applications, a high-fidelity visualization of the simulation is essential. Visual fidelity is dependent on the quality of the computer graphics algorithms used to render the simulated scene. Furthermore, the rendering algorithms must be implemented on the graphics processing unit (GPU) to achieve real-time performance, requiring the use of a graphics application programming interface (API). This paper presents a performance-focused and lightweight rendering engine supporting the Vulkan graphics API. The engine is designed to modernize the legacy rendering pipeline of Asynchronous Multi-Body Framework (AMBF), a dynamic simulation framework used extensively for interactive robotics simulation development. This new rendering engine implements graphical features such as physically based rendering (PBR), anti-aliasing, and ray-traced shadows, significantly improving the image quality of AMBF. Computational experiments show that the engine can render a simulated scene with over seven million triangles while maintaining GPU computation times within two milliseconds.

* 8 pages, 8 figures, submitted to the 2024 IEEE International Conference on Robotic Computing (IRC)

Via

Access Paper or Ask Questions

A Hybrid Model and Learning-Based Force Estimation Framework for Surgical Robots

Sep 30, 2024

Hao Yang, Haoying Zhou, Gregory S. Fischer, Jie Ying Wu

Figure 1 for A Hybrid Model and Learning-Based Force Estimation Framework for Surgical Robots

Figure 2 for A Hybrid Model and Learning-Based Force Estimation Framework for Surgical Robots

Figure 3 for A Hybrid Model and Learning-Based Force Estimation Framework for Surgical Robots

Figure 4 for A Hybrid Model and Learning-Based Force Estimation Framework for Surgical Robots

Abstract:Haptic feedback to the surgeon during robotic surgery would enable safer and more immersive surgeries but estimating tissue interaction forces at the tips of robotically controlled surgical instruments has proven challenging. Few existing surgical robots can measure interaction forces directly and the additional sensor may limit the life of instruments. We present a hybrid model and learning-based framework for force estimation for the Patient Side Manipulators (PSM) of a da Vinci Research Kit (dVRK). The model-based component identifies the dynamic parameters of the robot and estimates free-space joint torque, while the learning-based component compensates for environmental factors, such as the additional torque caused by trocar interaction between the PSM instrument and the patient's body wall. We evaluate our method in an abdominal phantom and achieve an error in force estimation of under 10% normalized root-mean-squared error. We show that by using a model-based method to perform dynamics identification, we reduce reliance on the training data covering the entire workspace. Although originally developed for the dVRK, the proposed method is a generalizable framework for other compliant surgical robots. The code is available at https://github.com/vu-maple-lab/dvrk_force_estimation.

* Accepted by IROS 2024

Via

Access Paper or Ask Questions

Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance

Sep 10, 2024

Fangzhou Lin, Haotian Liu, Haoying Zhou, Songlin Hou, Kazunori D Yamada, Gregory S. Fischer, Yanhua Li, Haichong K. Zhang, Ziming Zhang

Figure 1 for Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance

Figure 2 for Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance

Figure 3 for Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance

Figure 4 for Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance

Abstract:3D point clouds enhanced the robot's ability to perceive the geometrical information of the environments, making it possible for many downstream tasks such as grasp pose detection and scene understanding. The performance of these tasks, though, heavily relies on the quality of data input, as incomplete can lead to poor results and failure cases. Recent training loss functions designed for deep learning-based point cloud completion, such as Chamfer distance (CD) and its variants (\eg HyperCD ), imply a good gradient weighting scheme can significantly boost performance. However, these CD-based loss functions usually require data-related parameter tuning, which can be time-consuming for data-extensive tasks. To address this issue, we aim to find a family of weighted training losses ({\em weighted CD}) that requires no parameter tuning. To this end, we propose a search scheme, {\em Loss Distillation via Gradient Matching}, to find good candidate loss functions by mimicking the learning behavior in backpropagation between HyperCD and weighted CD. Once this is done, we propose a novel bilevel optimization formula to train the backbone network based on the weighted CD loss. We observe that: (1) with proper weighted functions, the weighted CD can always achieve similar performance to HyperCD, and (2) the Landau weighted CD, namely {\em Landau CD}, can outperform HyperCD for point cloud completion and lead to new state-of-the-art results on several benchmark datasets. {\it Our demo code is available at \url{https://github.com/Zhang-VISLab/IROS2024-LossDistillationWeightedCD}.}

* 10 pages, 7 figures, 7 tables, this paper was accepted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024

Via

Access Paper or Ask Questions

SurgicAI: A Fine-grained Platform for Data Collection and Benchmarking in Surgical Policy Learning

Jun 19, 2024

Jin Wu, Haoying Zhou, Peter Kazanzides, Adnan Munawar, Anqi Liu

Figure 1 for SurgicAI: A Fine-grained Platform for Data Collection and Benchmarking in Surgical Policy Learning

Figure 2 for SurgicAI: A Fine-grained Platform for Data Collection and Benchmarking in Surgical Policy Learning

Figure 3 for SurgicAI: A Fine-grained Platform for Data Collection and Benchmarking in Surgical Policy Learning

Figure 4 for SurgicAI: A Fine-grained Platform for Data Collection and Benchmarking in Surgical Policy Learning

Abstract:Despite advancements in robotic-assisted surgery, automating complex tasks like suturing remain challenging due to the need for adaptability and precision. Learning-based approaches, particularly reinforcement learning (RL) and imitation learning (IL), require realistic simulation environments for efficient data collection. However, current platforms often include only relatively simple, non-dexterous manipulations and lack the flexibility required for effective learning and generalization. We introduce SurgicAI, a novel platform for development and benchmarking addressing these challenges by providing the flexibility to accommodate both modular subtasks and more importantly task decomposition in RL-based surgical robotics. Compatible with the da Vinci Surgical System, SurgicAI offers a standardized pipeline for collecting and utilizing expert demonstrations. It supports deployment of multiple RL and IL approaches, and the training of both singular and compositional subtasks in suturing scenarios, featuring high dexterity and modularization. Meanwhile, SurgicAI sets clear metrics and benchmarks for the assessment of learned policies. We implemented and evaluated multiple RL and IL algorithms on SurgicAI. Our detailed benchmark analysis underscores SurgicAI's potential to advance policy learning in surgical robotics. Details: \url{https://github.com/surgical-robotics-ai/SurgicAI

Via

Access Paper or Ask Questions

Realistic Data Generation for 6D Pose Estimation of Surgical Instruments

Jun 11, 2024

Juan Antonio Barragan, Jintan Zhang, Haoying Zhou, Adnan Munawar, Peter Kazanzides

Figure 1 for Realistic Data Generation for 6D Pose Estimation of Surgical Instruments

Figure 2 for Realistic Data Generation for 6D Pose Estimation of Surgical Instruments

Figure 3 for Realistic Data Generation for 6D Pose Estimation of Surgical Instruments

Figure 4 for Realistic Data Generation for 6D Pose Estimation of Surgical Instruments

Abstract:Automation in surgical robotics has the potential to improve patient safety and surgical efficiency, but it is difficult to achieve due to the need for robust perception algorithms. In particular, 6D pose estimation of surgical instruments is critical to enable the automatic execution of surgical maneuvers based on visual feedback. In recent years, supervised deep learning algorithms have shown increasingly better performance at 6D pose estimation tasks; yet, their success depends on the availability of large amounts of annotated data. In household and industrial settings, synthetic data, generated with 3D computer graphics software, has been shown as an alternative to minimize annotation costs of 6D pose datasets. However, this strategy does not translate well to surgical domains as commercial graphics software have limited tools to generate images depicting realistic instrument-tissue interactions. To address these limitations, we propose an improved simulation environment for surgical robotics that enables the automatic generation of large and diverse datasets for 6D pose estimation of surgical instruments. Among the improvements, we developed an automated data generation pipeline and an improved surgical scene. To show the applicability of our system, we generated a dataset of 7.5k images with pose annotations of a surgical needle that was used to evaluate a state-of-the-art pose estimation network. The trained model obtained a mean translational error of 2.59mm on a challenging dataset that presented varying levels of occlusion. These results highlight our pipeline's success in training and evaluating novel vision algorithms for surgical robotics applications.

* 2024 IEEE International Conference on Robotics and Automation (ICRA)
* 6 pages

Via

Access Paper or Ask Questions

Suturing Tasks Automation Based on Skills Learned From Demonstrations: A Simulation Study

Mar 01, 2024

Haoying Zhou, Yiwei Jiang, Shang Gao, Shiyue Wang, Peter Kazanzides, Gregory S. Fischer

Abstract:In this work, we develop an open-source surgical simulation environment that includes a realistic model obtained by MRI-scanning a physical phantom, for the purpose of training and evaluating a Learning from Demonstration (LfD) algorithm for autonomous suturing. The LfD algorithm utilizes Dynamic Movement Primitives (DMP) and Locally Weighted Regression (LWR), but focuses on the needle trajectory, rather than the instruments, to obtain better generality with respect to needle grasps. We conduct a user study to collect multiple suturing demonstrations and perform a comprehensive analysis of the ability of the LfD algorithm to generalize from a demonstration at one location in one phantom to different locations in the same phantom and to a different phantom. Our results indicate good generalization, on the order of 91.5%, when learning from more experienced subjects, indicating the need to integrate skill assessment in the future.

Via

Access Paper or Ask Questions