Abstract: Context-aware decision support in the operating room can foster surgical safety and efficiency by leveraging real-time feedback from surgical workflow analysis. Most existing works recognize surgical activities at a coarse-grained level, such as phases, steps, or events, leaving out the fine-grained interaction details that more helpful AI assistance in the operating room requires. Recognizing surgical actions as <instrument, verb, target> triplets delivers comprehensive details about the activities taking place in surgical videos. This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos. The challenge granted private access to the large-scale CholecT50 dataset, which is annotated with action triplet information. We present the challenge setup and an assessment of the state-of-the-art deep learning methods proposed by the participants. In total, 4 baseline methods from the challenge organizers and 19 new deep learning algorithms from competing teams are presented for recognizing surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%. This study also analyzes the significance of the results obtained by the presented approaches, provides a thorough methodological comparison and an in-depth result analysis, and proposes a novel ensemble method for enhanced recognition. Our analysis shows that surgical workflow analysis is not yet solved and highlights interesting directions for future research on fine-grained surgical activity recognition, which is of utmost importance for the development of AI in surgery.
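To make the reported metric concrete, the following is a minimal sketch of how mean average precision can be computed for multi-label triplet recognition, assuming one binary label and one confidence score per frame and per triplet class. The function names and the toy data are illustrative assumptions; the challenge's exact evaluation protocol is not described in the abstract and may differ in detail.

```python
import numpy as np

def average_precision(y_true, y_score):
    """AP for one class: mean of the precision measured at each positive's rank."""
    order = np.argsort(-y_score, kind="stable")  # highest score first
    hits = y_true[order].astype(float)
    if hits.sum() == 0:
        return np.nan  # class never occurs in the ground truth
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())

def mean_average_precision(Y_true, Y_score):
    """Mean of per-class AP, skipping classes with no positive frame."""
    aps = [average_precision(Y_true[:, c], Y_score[:, c])
           for c in range(Y_true.shape[1])]
    return float(np.nanmean(aps))

# Toy data (hypothetical): 4 frames, 2 triplet classes.
Y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
Y_score = np.array([[0.9, 0.2], [0.8, 0.7], [0.4, 0.6], [0.1, 0.3]])
print(round(mean_average_precision(Y_true, Y_score), 4))  # 0.9167
```

Averaging per class rather than per frame gives rare triplets the same weight as frequent ones, which is one common choice for long-tailed label sets.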
Abstract: Tissue oxygenation and perfusion can indicate organ viability during minimally invasive surgery, which motivates real-time assessment of tissue perfusion and oxygen saturation. Multispectral imaging (MSI) is an optical modality that can inspect tissue perfusion in wide-field images without contact. In this paper, we present a novel, fast method for using RGB images for MSI, which, while limiting the spectral resolution of the modality, allows standard laparoscopic systems to be used. We exploit the discrete Haar decomposition to separate individual video frames into low-pass and directional coefficients, and we utilise a different multispectral estimation technique on each. The increase in speed is achieved by using fast Tikhonov regularisation on the directional coefficients and more accurate Bayesian estimation on the low-pass component. The pipeline is implemented on a graphics processing unit (GPU) architecture and achieves a frame rate of approximately 15 Hz. We validate the method on animal models and on human data captured using a da Vinci stereo laparoscope.
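The split described above can be sketched on the CPU as follows: a single-level 2D Haar transform separates a frame into one low-pass band and three directional bands, and a Tikhonov-regularised linear estimator maps RGB coefficients to spectral bands. This is a minimal sketch, not the authors' implementation. The camera response matrix `C`, the regularisation weight `lam`, and all function names are assumptions made for illustration, and the Bayesian low-pass step is left as a pluggable callable (`lowpass_estimator`) because the abstract does not specify it.

```python
import numpy as np

def haar2d(x):
    """Single-level 2D Haar transform over the first two axes (even sizes)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0   # low-pass band
    lh = (a - b + c - d) / 2.0   # directional bands
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, (lh, hl, hh)

def ihaar2d(ll, details):
    """Inverse of haar2d."""
    lh, hl, hh = details
    h, w = ll.shape[:2]
    out = np.empty((2 * h, 2 * w) + ll.shape[2:], dtype=ll.dtype)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

def tikhonov_matrix(C, lam):
    """W = C^T (C C^T + lam I)^-1, mapping RGB (3) to spectral bands (n)."""
    A = C @ C.T + lam * np.eye(C.shape[0])
    return np.linalg.solve(A, C).T

def estimate_multispectral(rgb, C, lam=1e-2, lowpass_estimator=None):
    """Estimate a spectral cube from an RGB frame, band by wavelet band."""
    W = tikhonov_matrix(C, lam)
    ll, details = haar2d(rgb.astype(np.float64))
    # Fast linear estimate on the directional coefficients.
    est_details = tuple(band @ W.T for band in details)
    # Slower, more accurate estimator on the low-pass band if one is supplied
    # (assumption: it takes an (h, w, 3) array and returns (h, w, n)).
    est_ll = lowpass_estimator(ll) if lowpass_estimator else ll @ W.T
    return ihaar2d(est_ll, est_details)

# Hypothetical inputs: random 3 x 8 camera response and a 4 x 6 RGB frame.
rng = np.random.default_rng(0)
C = rng.random((3, 8))
rgb = rng.random((4, 6, 3))
cube = estimate_multispectral(rgb, C)
print(cube.shape)  # (4, 6, 8)
```

Because both the Haar transform and the Tikhonov estimator are linear, this sketch without a custom `lowpass_estimator` reproduces a plain per-pixel Tikhonov estimate. The split pays off only when the low-pass band, which holds a quarter of the coefficients, is handed to a costlier estimator.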