Abstract: Intra-cardiac Echocardiography (ICE) is a crucial imaging modality used in electrophysiology (EP) and structural heart disease (SHD) interventions, providing real-time, high-resolution views from within the heart. Despite its advantages, effective manipulation of the ICE catheter requires significant expertise, which can lead to inconsistent outcomes, particularly among less experienced operators. To address this challenge, we propose an AI-driven closed-loop view guidance system with human-in-the-loop feedback, designed to assist users in navigating ICE imaging without requiring specialized knowledge. Our method models the relative position and orientation vectors between arbitrary views and clinically defined ICE views in a spatial coordinate system, guiding users on how to manipulate the ICE catheter to transition from the current view to the desired view over time. Operating in a closed-loop configuration, the system continuously predicts and updates the necessary catheter manipulations, ensuring seamless integration into existing clinical workflows. The effectiveness of the proposed system is demonstrated through a simulation-based evaluation, achieving an 89% success rate on a test dataset of size 6532, highlighting its potential to improve the accuracy and efficiency of ICE imaging procedures.
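The relative position and orientation between a current view and a target view, as described in the abstract above, can be illustrated with a minimal sketch. It assumes each view is given as a position vector plus a 3x3 rotation matrix in a common frame; the function names (`relative_pose`, `rotation_angle_deg`, `rot_z`) are hypothetical and are not taken from the paper, which does not state its pose parameterization here.

```python
import math

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(deg):
    """Rotation about the z axis, used only to build example poses."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def relative_pose(p_cur, r_cur, p_tgt, r_tgt):
    """Express the target view's pose in the current view's frame."""
    r_cur_t = transpose(r_cur)  # inverse of a rotation matrix
    p_rel = matvec(r_cur_t, [t - c for t, c in zip(p_tgt, p_cur)])
    r_rel = matmul(r_cur_t, r_tgt)
    return p_rel, r_rel

def rotation_angle_deg(r):
    """Magnitude of a rotation, recovered from the matrix trace."""
    trace = r[0][0] + r[1][1] + r[2][2]
    cos_a = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp for safety
    return math.degrees(math.acos(cos_a))

# Example: current view rotated 90 degrees about z, target rotated 120 degrees
# and displaced 10 units along the common frame's x axis.
p_rel, r_rel = relative_pose([0.0, 0.0, 0.0], rot_z(90.0),
                             [10.0, 0.0, 0.0], rot_z(120.0))
angle = rotation_angle_deg(r_rel)
```

In a closed-loop setting, a quantity like `p_rel` and `angle` would be recomputed after every catheter manipulation until both fall below chosen thresholds; those thresholds are not specified in the abstract.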
Abstract: Despite recent developments in CT planning that enabled automation in patient positioning, time-consuming scout scans are still needed to compute the dose profile and ensure the patient is properly positioned. In this paper, we present a novel method that eliminates the need for scout scans in CT lung cancer screening by estimating patient scan range, isocenter, and Water Equivalent Diameter (WED) from 3D camera images. We achieve this by training an implicit generative model on over 60,000 CT scans and introduce a novel approach for updating the prediction using real-time scan data. We demonstrate the effectiveness of our method on a testing set of 110 pairs of depth data and CT scans, resulting in an average error of 5 mm in estimating the isocenter, 13 mm in determining the scan range, and 10 mm and 16 mm in estimating the AP and lateral WED, respectively. The relative WED error of our method is 4%, which is well within the International Electrotechnical Commission (IEC) acceptance criterion of 10%.
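For readers unfamiliar with WED, the sketch below shows how the quantity is commonly computed from a CT slice and how a relative error can be compared against a 10% threshold. It uses the standard definition from AAPM Report 220 (water-equivalent area derived from the mean HU inside the patient region); this is an assumption about the metric, not a description of how the paper estimates WED from camera images. The function names and the sample values are hypothetical.

```python
import math

def water_equivalent_diameter(hu_values, pixel_area_mm2):
    """WED in mm for one axial slice, from the HU values inside the patient ROI."""
    roi_area = len(hu_values) * pixel_area_mm2
    mean_hu = sum(hu_values) / len(hu_values)
    # Water-equivalent area: scale the ROI area by relative attenuation.
    water_equiv_area = (mean_hu / 1000.0 + 1.0) * roi_area
    return 2.0 * math.sqrt(water_equiv_area / math.pi)

def relative_error(predicted, reference):
    return abs(predicted - reference) / reference

# Example: a pure-water ROI (HU = 0) of 10,000 pixels at 1 mm^2 each.
reference_wed = water_equivalent_diameter([0.0] * 10000, 1.0)
predicted_wed = reference_wed * 1.04          # hypothetical 4% overestimate
within_criterion = relative_error(predicted_wed, reference_wed) <= 0.10
```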
Abstract: Endovascular guidewire manipulation is essential for minimally invasive clinical applications such as percutaneous coronary intervention (PCI), mechanical thrombectomy for acute ischemic stroke (AIS), and transjugular intrahepatic portosystemic shunt (TIPS). All of these procedures commonly require 3D vessel geometries from 3D CTA (Computed Tomography Angiography) images. During these procedures, the clinician generally places a guiding catheter in the ostium of the relevant vessel and then manipulates a wire through the catheter and across the blockage. The clinician uses X-ray fluoroscopy only intermittently to visualize and guide the catheter, guidewire, and other devices. As a result, clinicians still passively control guidewires and catheters by relying on limited, indirect observation from X-ray fluoroscopy (i.e., a 2D partial view of the devices, updated only intermittently because of radiation limits). Modeling and controlling guidewire manipulation in coronary vessels remains challenging because the complicated interaction between guidewire motions with different physical properties (i.e., loads, coating) and vessel geometries with varying lumen conditions results in a highly non-linear system. This paper introduces a scalable learning pipeline to train AI-based agent models toward automated endovascular predictive device control. First, we create a scalable environment by pre-processing 3D CTA images, providing patient-specific 3D vessel geometry and the coronary centerline. Next, we apply a large quantity of randomly generated motion sequences from the proximal end to generate wire states associated with each environment using a physics-based device simulator. Then, we reformulate the control problem as a sequence-to-sequence learning problem, in which we use a Transformer-based model trained to handle non-linear sequential forward/inverse transition functions.
Abstract: Purpose: Intra-Cardiac Echocardiography (ICE) is a powerful imaging modality for guiding cardiac electrophysiology and structural heart interventions. ICE provides real-time observation of anatomy and devices, while enabling direct monitoring of potential complications. In single-operator settings, the physician needs to switch back and forth between the ICE catheter and the therapy device, making continuous ICE support impossible. Two-operator setups are therefore sometimes implemented, but they increase procedural costs and room occupation. Methods: An ICE catheter robotic control system is developed with an automated catheter tip repositioning (i.e., view recovery) method, which can reproduce important views previously navigated to and saved by the user. The performance of the proposed method is demonstrated and evaluated in a combination of heart phantom and animal experiments. Results: Automated ICE view recovery achieved a catheter tip position accuracy of 2.09 +/- 0.90 mm and a catheter image orientation accuracy of 3.93 +/- 2.07 degrees in animal studies, and 0.67 +/- 0.79 mm and 0.37 +/- 0.19 degrees in heart phantom studies, respectively. The proposed method was also used successfully during transseptal puncture in animals without complications, showing the possibility of fluoro-less transseptal puncture with an ICE catheter robot. Conclusion: Robotic ICE imaging has the potential to provide precise and reproducible anatomical views, which can reduce overall execution time, the labor burden of procedures, and X-ray usage for a range of cardiac procedures. Keywords: Automated View Recovery, Path Planning, Intra-cardiac echocardiography (ICE), Catheter, Tendon-driven manipulator, Cardiac Imaging
Abstract: Position sensitive detectors (PSDs) offer the possibility of tracking the two (or three) degrees of freedom (DoF) position of a single active marker with high accuracy, fast response time, high update frequency, and low latency, all using a very simple signal processing circuit. However, they are not particularly suitable for 6-DoF object pose tracking because of their lack of orientation measurement, limited tracking range, and sensitivity to environmental variation. We propose a novel 6-DoF pose tracking system for rigid objects that requires only a single active marker. The proposed system uses a stereo-based PSD pair and multiple Inertial Measurement Units (IMUs). It is based on a practical approach to identifying and controlling the power of Infrared-Light Emitting Diode (IR-LED) active markers, with the aim of increasing the tracking work space and reducing power consumption. Our proposed tracking system is validated with three different work space sizes, and for static and dynamic positional accuracy using a robotic arm manipulator with three different dynamic motion patterns. The results show that the static position root-mean-square (RMS) error is 0.6 mm, the dynamic position RMS error is 0.7-0.9 mm, and the orientation RMS error is between 0.04 and 0.9 degrees under varied dynamic motion. Overall, our proposed tracking system is capable of tracking a rigid object pose with sub-millimeter accuracy in the mid range of the work space and sub-degree accuracy across the whole work space in a lab setting.
Abstract: Tendon-sheath-driven manipulators (TSMs) are widely used in minimally invasive surgical systems because their long, thin shape, flexibility, and compliance make them easily steerable in narrow or tortuous environments. Many commercial TSM-based medical devices exhibit non-linear phenomena resulting from their composition, such as backlash and dead zone hysteresis, which pose a considerable challenge for achieving precise control of the end effector pose. However, many recent works in the literature do not consider the combined effects and compensation of these phenomena, and few focus on practical ways to identify model parameters in the field. In this paper, we propose a simplified piece-wise linear model to compensate for both backlash and dead zone hysteresis together. Further, we introduce a practical method to identify model parameters using motor current from a robotic controller for the TSM. We analyze our proposed methods with multiple Intra-cardiac Echocardiography catheters, which are a typical commercial example of a TSM. Our results show that the errors from backlash and dead zone hysteresis are considerably reduced, and therefore the accuracy of robotic control is improved, when applying the presented methods.
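To make the idea of compensating a dead zone concrete, the following is a minimal sketch of a generic direction-reversal offset compensator: whenever the commanded direction reverses, a fixed offset equal to an assumed dead-zone width is added so the motor crosses the slack before the tip is expected to move. This is a simplified textbook-style illustration, not the piece-wise linear model proposed in the paper; the class name, the `width` parameter, and the example values are all hypothetical.

```python
class ReversalOffsetCompensator:
    """Adds a fixed offset on each direction reversal (hypothetical model)."""

    def __init__(self, width):
        self.width = width    # assumed dead-zone width, in motor units
        self.direction = 0    # last commanded direction: -1, 0, or +1
        self.previous = 0.0   # last desired (uncompensated) command
        self.offset = 0.0     # accumulated compensation

    def compensate(self, desired):
        delta = desired - self.previous
        if delta != 0:
            new_direction = 1 if delta > 0 else -1
            # On reversal, shift the command by the dead-zone width so the
            # motor takes up the slack in the new direction.
            if self.direction != 0 and new_direction != self.direction:
                self.offset += new_direction * self.width
            self.direction = new_direction
        self.previous = desired
        return desired + self.offset

comp = ReversalOffsetCompensator(width=2.0)
commands = [comp.compensate(x) for x in [5.0, 10.0, 8.0, 9.0]]
# Forward moves pass through unchanged; the reversal at 8.0 is pushed to 6.0,
# and the second reversal at 9.0 removes the offset again.
```

In practice the width would have to be identified per device; the abstract describes doing this from motor current, which this sketch does not attempt to reproduce.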
Abstract: Intra-cardiac Echocardiography (ICE) has been evolving as a real-time imaging modality of choice for guiding electrophysiology and structural heart interventions. ICE provides real-time imaging of anatomy, catheters, and complications such as pericardial effusion or thrombus formation. However, the increased reliance on intraprocedural imaging now places a high cognitive demand on physicians. In response, we present a robotic manipulator for AcuNav ICE catheters to alleviate the physician's burden and support applied methods for greater automation. Herein, we introduce two methods towards these goals: (1) a data-driven method to compensate for kinematic model errors due to non-linear elasticity in catheter bending, providing more precise robotic control, and (2) an automated image recovery process that allows physicians to bookmark images during an intervention and automatically return to them with the push of a button. To validate our error compensation method, we demonstrate a complex rotation of the ultrasound imaging plane evaluated on the benchtop. Automated view recovery is validated by repeated imaging of landmarks in benchtop and in vivo experiments with position- and image-based analysis. Results support that a robotic-assist system for more autonomous ICE can provide a safe and efficient tool, potentially reducing execution time and allowing more complex procedures to become commonplace.
Abstract: Point cloud based methods have produced promising results in areas such as 3D object detection in autonomous driving. However, most recent point cloud work focuses on single depth sensor data, whereas less work has been done on indoor monitoring applications, such as operating room monitoring in hospitals or indoor surveillance. In these scenarios, multiple cameras are often used to tackle occlusion problems. We propose an end-to-end multi-person 3D pose estimation network, Point R-CNN, using multiple point cloud sources. We conduct extensive experiments to simulate challenging real-world cases, such as individual camera failures, various target appearances, and complex cluttered scenes, with the CMU Panoptic dataset and the MVOR operating room dataset. Unlike most previous methods, which attempt to use information from multiple sensors by building complex fusion models that often generalize poorly, we take advantage of the efficiency of concatenating point clouds to fuse the information at the input level. We also show that our end-to-end network greatly outperforms cascaded state-of-the-art models.
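Input-level fusion by concatenation, as mentioned in the abstract above, can be sketched as follows. The sketch assumes each camera has known extrinsics (a 3x3 rotation and a translation) that map its points into a shared world frame; the function names are hypothetical and the abstract does not describe the paper's actual preprocessing. A failed camera is modeled simply as an empty point list.

```python
def to_world(points, rotation, translation):
    """Map camera-frame points into the shared world frame."""
    return [
        tuple(sum(rotation[i][k] * p[k] for k in range(3)) + translation[i]
              for i in range(3))
        for p in points
    ]

def fuse_point_clouds(clouds, extrinsics):
    """Concatenate all cameras' points after moving them to the world frame."""
    fused = []
    for points, (rotation, translation) in zip(clouds, extrinsics):
        fused.extend(to_world(points, rotation, translation))
    return fused

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
clouds = [
    [(1.0, 0.0, 0.0)],   # camera A sees one point
    [(0.0, 0.0, 0.0)],   # camera B sees one point at its own origin
    [],                  # camera C has failed and contributes nothing
]
extrinsics = [
    (identity, (0.0, 0.0, 0.0)),
    (identity, (0.0, 2.0, 0.0)),   # camera B sits 2 units along world y
    (identity, (5.0, 0.0, 0.0)),
]
fused = fuse_point_clouds(clouds, extrinsics)
```

Because the fused cloud is just a longer list of points, a downstream network that accepts variable-size point sets sees a missing camera as fewer points rather than as a missing input branch.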
Abstract: Landmark localization is a challenging problem in computer vision with a multitude of applications. Recent deep learning based methods have shown improved results by regressing likelihood maps instead of regressing the coordinates directly. However, setting the precision of these regression targets during training is a cumbersome process, since it creates a trade-off between trainability and localization accuracy. Using precise targets introduces a significant sampling bias and hence makes training more difficult, whereas using imprecise targets results in inaccurate landmark detectors. In this paper, we introduce "Adaloss", an objective function that adapts itself during training by updating the target precision based on the training statistics. This approach does not require setting problem-specific parameters and shows improved stability in training and better localization accuracy during inference. We demonstrate the effectiveness of our proposed method in three different applications of landmark localization: 1) the challenging task of precisely detecting catheter tips in medical X-ray images, 2) localizing surgical instruments in endoscopic images, and 3) localizing facial features in in-the-wild images, where we show state-of-the-art results on the 300-W benchmark dataset.
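The trade-off described above can be illustrated with a Gaussian likelihood-map target whose standard deviation plays the role of target precision: a large sigma gives a broad, easy-to-learn target, while a small sigma gives a sharp, precise one. The update rule below is a hypothetical stand-in (shrink sigma once recent losses have plateaued) and is not the Adaloss rule itself, which the abstract does not spell out; `tolerance`, `shrink`, and `min_sigma` are illustrative parameters.

```python
import math

def gaussian_target(height, width, cx, cy, sigma):
    """Likelihood map with a Gaussian bump centered on the landmark."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]

def update_sigma(sigma, recent_losses, tolerance=0.05, shrink=0.9, min_sigma=1.0):
    """Hypothetical rule: sharpen the target once the loss has plateaued."""
    mean = sum(recent_losses) / len(recent_losses)
    spread = max(recent_losses) - min(recent_losses)
    if mean > 0 and spread / mean < tolerance:
        return max(min_sigma, sigma * shrink)
    return sigma

broad = gaussian_target(9, 9, cx=4, cy=4, sigma=4.0)
sharp = gaussian_target(9, 9, cx=4, cy=4, sigma=1.0)
# One pixel away from the landmark, the sharp target has already dropped
# much further than the broad one.
sigma_after_plateau = update_sigma(4.0, [0.50, 0.51, 0.50])
sigma_while_improving = update_sigma(4.0, [1.0, 0.5, 0.2])
```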
Abstract: Automatic delineation and measurement of major organs such as the liver is a critical step in the assessment of hepatic diseases, planning, and postoperative or treatment follow-up. However, addressing this problem typically requires performing computed tomography (CT) scanning and complicated postprocessing of the resulting scans using slice-by-slice techniques. In this paper, we show that 3D organ shape can be automatically predicted directly from topogram images, which are easier to acquire and involve less radiation exposure during acquisition than CT scans. We evaluate our approach on the challenging task of predicting liver shape using a generative model. We also demonstrate that our method can be combined with user annotations, such as a 2D mask, for improved prediction accuracy. We show compelling results on 3D liver shape reconstruction and volume estimation on 2129 CT scans.