Title: TUNeS: A Temporal U-Net with Self-Attention for Video-based Surgical Phase Recognition

Jul 19, 2023

Isabel Funke, Dominik Rivoir, Stefanie Krell, Stefanie Speidel

Abstract: To enable context-aware computer assistance in the operating room of the future, cognitive systems need to understand automatically which surgical phase is being performed by the medical team. The primary source of information for surgical phase recognition is typically video, which presents two challenges: extracting meaningful features from the video stream and effectively modeling temporal information in the sequence of visual features. For temporal modeling, attention mechanisms have gained popularity due to their ability to capture long-range dependencies. In this paper, we explore design choices for attention in existing temporal models for surgical phase recognition and propose a novel approach that does not resort to local attention or regularization of attention weights: TUNeS is an efficient and simple temporal model that incorporates self-attention at the coarsest stage of a U-Net-like structure. In addition, we propose to train the feature extractor, a standard CNN, together with an LSTM on preferably long video segments, i.e., with long temporal context. In our experiments, all temporal models performed better on top of feature extractors that were trained with longer temporal context. On top of these contextualized features, TUNeS achieves state-of-the-art results on Cholec80.
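
To make the idea of "self-attention at the coarsest stage of a U-Net-like structure" concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the average-pooling downsampling, nearest-neighbour upsampling, additive skip connections, single attention head, random weights, and all function names are assumptions chosen for illustration. The sketch only shows why attention becomes cheap in this layout: after `num_stages` halvings, the attention matrix covers T / 2^num_stages time steps instead of T.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head global self-attention over a (T, D) sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (T, T) attention logits
    return softmax(scores, axis=-1) @ v

def downsample(x):
    """Halve the temporal resolution by averaging adjacent frames."""
    t, d = x.shape
    return x.reshape(t // 2, 2, d).mean(axis=1)

def upsample(x):
    """Double the temporal resolution by repeating each time step."""
    return np.repeat(x, 2, axis=0)

def temporal_unet_with_attention(features, num_stages, rng):
    """Hypothetical U-Net-like temporal model over (T, D) frame features."""
    t, d = features.shape
    assert t % (2 ** num_stages) == 0, "T must be divisible by 2**num_stages"

    # Encoder: store skip connections, then coarsen the sequence.
    skips = []
    x = features
    for _ in range(num_stages):
        skips.append(x)
        x = downsample(x)

    # Bottleneck: attention runs only on the shortest sequence.
    # Weights are random here; a real model would learn them.
    w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    x = x + self_attention(x, w_q, w_k, w_v)

    # Decoder: restore resolution and merge the skip connections.
    for skip in reversed(skips):
        x = upsample(x) + skip
    return x

rng = np.random.default_rng(0)
features = rng.standard_normal((64, 16))  # 64 frames, 16-dim features
out = temporal_unet_with_attention(features, num_stages=3, rng=rng)
print(out.shape)  # (64, 16)
```

With 64 input frames and three stages, attention is computed over 8 time steps, so the attention matrix is 8 x 8 rather than 64 x 64. A full model would additionally use learned convolutions in the encoder and decoder and a per-frame classifier over the surgical phases.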
