Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ayberk Acar

From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction

Mar 20, 2025

Ayberk Acar, Mariana Smith, Lidia Al-Zogbi, Tanner Watts, Fangjie Li, Hao Li, Nural Yilmaz, Paul Maria Scheikl, Jesse F. d'Almeida, Susheela Sharma(+7 more)

Abstract:Surgical automation requires precise guidance and understanding of the scene. Current methods in the literature rely on bulky depth cameras to create maps of the anatomy, however this does not translate well to space-limited clinical applications. Monocular cameras are small and allow minimally invasive surgeries in tight spaces but additional processing is required to generate 3D scene understanding. We propose a 3D mapping pipeline that uses only RGB images to create segmented point clouds of the target anatomy. To ensure the most precise reconstruction, we compare different structure from motion algorithms' performance on mapping the central airway obstructions, and test the pipeline on a downstream task of tumor resection. In several metrics, including post-procedure tissue model evaluation, our pipeline performs comparably to RGB-D cameras and, in some cases, even surpasses their performance. These promising results demonstrate that automation guidance can be achieved in minimally invasive procedures with monocular cameras. This study is a step toward the complete autonomy of surgical robots.

* 7 Pages, 8 Figures, 1 Table. This work has been submitted IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) for possible publication

Via

Access Paper or Ask Questions

An Effectiveness Study Across Baseline and Neural Network-based Force Estimation Methods on the da Vinci Research Kit Si System

May 13, 2024

Hao Yang, Ayberk Acar, Keshuai Xu, Anton Deguet, Peter Kazanzides, Jie Ying Wu

Abstract:In this study, we further investigate the robustness and generalization ability of an neural network (NN) based force estimation method, using the da Vinci Research Kit Si (dVRK-Si). To evaluate our method's performance, we compare the force estimation accuracy with several baseline methods. We conduct comparative studies between the dVRK classic and dVRK-Si systems to benchmark the effectiveness of these approaches. We conclude that the NN-based method provides comparable force estimation accuracy across the two systems, as the average root mean square error (RMSE) over the average range of force ratio is approximately 3.07% for the dVRK classic, and 5.27% for the dVRK-Si. On the dVRK-Si, the force estimation RMSEs for all the baseline methods are 2 to 4 times larger than the NN-based method in all directions. One possible reason is, we made assumptions in the baseline methods that static forces remain the same or dynamics is time-invariant. These assumptions may hold for the dVRK Classic, as it has pre-loaded weight and maintains horizontal self balance. Since the dVRK-Si configuration does not have this property, assumptions do not hold anymore, therefore the NN-based method significantly outperforms.

* Accepted by the Hamlyn Symposium on Medical Robotics 2024

Via

Access Paper or Ask Questions

MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy

Apr 03, 2024

John J. Han, Ayberk Acar, Nicholas Kavoussi, Jie Ying Wu

Figure 1 for MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy

Figure 2 for MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy

Figure 3 for MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy

Figure 4 for MeshBrush: Painting the Anatomical Mesh with Neural Stylization for Endoscopy

Abstract:Style transfer is a promising approach to close the sim-to-real gap in medical endoscopy. Rendering realistic endoscopic videos by traversing pre-operative scans (such as MRI or CT) can generate realistic simulations as well as ground truth camera poses and depth maps. Although image-to-image (I2I) translation models such as CycleGAN perform well, they are unsuitable for video-to-video synthesis due to the lack of temporal consistency, resulting in artifacts between frames. We propose MeshBrush, a neural mesh stylization method to synthesize temporally consistent videos with differentiable rendering. MeshBrush uses the underlying geometry of patient imaging data while leveraging existing I2I methods. With learned per-vertex textures, the stylized mesh guarantees consistency while producing high-fidelity outputs. We demonstrate that mesh stylization is a promising approach for creating realistic simulations for downstream tasks such as training and preoperative planning. Although our method is tested and designed for ureteroscopy, its components are transferable to general endoscopic and laparoscopic procedures.

* 10 pages, 5 figures

Via

Access Paper or Ask Questions

Depth Anything in Medical Images: A Comparative Study

Jan 29, 2024

John J. Han, Ayberk Acar, Callahan Henry, Jie Ying Wu

Abstract:Monocular depth estimation (MDE) is a critical component of many medical tracking and mapping algorithms, particularly from endoscopic or laparoscopic video. However, because ground truth depth maps cannot be acquired from real patient data, supervised learning is not a viable approach to predict depth maps for medical scenes. Although self-supervised learning for MDE has recently gained attention, the outputs are difficult to evaluate reliably and each MDE's generalizability to other patients and anatomies is limited. This work evaluates the zero-shot performance of the newly released Depth Anything Model on medical endoscopic and laparoscopic scenes. We compare the accuracy and inference speeds of Depth Anything with other MDE models trained on general scenes as well as in-domain models trained on endoscopic data. Our findings show that although the zero-shot capability of Depth Anything is quite impressive, it is not necessarily better than other models in both speed and performance. We hope that this study can spark further research in employing foundation models for MDE in medical scenes.

* 10 pages, 2 figures, 3 tables

Via

Access Paper or Ask Questions