Abstract: Improved surgical skill is generally associated with improved patient outcomes, but its assessment is subjective and labour-intensive, and requires domain-specific expertise. Automated data-driven metrics can alleviate these difficulties, as demonstrated by existing machine learning instrument tracking models in minimally invasive surgery. However, these models have been tested on limited datasets of laparoscopic surgery, with a focus on isolated tasks and robotic surgery. In this paper, a new public dataset is introduced, focusing on simulated surgery, using the nasal phase of endoscopic pituitary surgery as an exemplar. Simulated surgery provides a realistic yet repeatable environment, so novice surgeons can use the insights gained from automated assessment to hone their skills on the simulator before moving to real surgery. PRINTNet (Pituitary Real-time INstrument Tracking Network) has been created as a baseline model for this automated assessment. Consisting of DeepLabV3 for classification and segmentation, StrongSORT for tracking, and the NVIDIA Holoscan SDK for real-time performance, PRINTNet achieved 71.9% Multiple Object Tracking Precision while running at 22 Frames Per Second. Using this tracking output, a Multilayer Perceptron achieved 87% accuracy in predicting surgical skill level (novice or expert), with the "ratio of total procedure time to instrument visible time" correlated with higher surgical skill. This demonstrates the feasibility of automated surgical skill assessment in simulated endoscopic pituitary surgery. The new publicly available dataset can be found here: https://doi.org/10.5522/04/26511049.
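As an illustration of the "ratio of total procedure time to instrument visible time" named above, the following is a minimal sketch of how such a metric could be derived from per-frame tracking output. The input format (one boolean per frame indicating whether any instrument is tracked) and the function name `procedure_to_visible_ratio` are assumptions made for this example; the abstract does not describe how the metric is computed in practice.

```python
from typing import Sequence, Tuple


def procedure_to_visible_ratio(
    instrument_visible: Sequence[bool], fps: float
) -> Tuple[float, float, float]:
    """Return (total_time_s, visible_time_s, ratio) for one video.

    instrument_visible: hypothetical per-frame flags, True when the tracker
        reports at least one instrument in that frame.
    fps: frame rate of the video, used to convert frame counts to seconds.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")

    total_time = len(instrument_visible) / fps
    visible_time = sum(1 for flag in instrument_visible if flag) / fps

    # If no instrument is ever visible, the ratio is undefined; report infinity.
    ratio = float("inf") if visible_time == 0 else total_time / visible_time
    return total_time, visible_time, ratio


# Toy input: 4 frames, instrument visible in 3 of them, at 22 frames per second.
flags = [True, True, False, True]
total_s, visible_s, ratio = procedure_to_visible_ratio(flags, fps=22.0)
```

Because both durations are divided by the same frame rate, the ratio depends only on the frame counts; the durations are returned as well since they may be useful as separate features for a downstream classifier.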
Abstract: This paper presents a novel algorithm, named Direct Simultaneous Registration (DSR), that registers a collection of mono-modal 3D images simultaneously. The algorithm optimizes the global poses of local frames directly from image intensities, without extracting features from the images. To obtain the optimal result, we first formulate a Direct Bundle Adjustment (DBA) problem that jointly optimizes the pose parameters of the local frames and the intensities of the panoramic image. By proving that the poses are independent of the panoramic image in the iterative process, we propose DSR and prove that it generates the same optimal poses as DBA without optimizing the intensities of the panoramic image. The proposed DSR method is particularly suitable for mono-modal registration and for scenarios where distinct features are not available, such as Transesophageal Echocardiography (TEE) images. The method is validated on simulated and in-vivo 3D TEE images. It is shown to outperform the conventional sequential registration method in terms of accuracy and to produce good alignment on in-vivo images.