Title: Relevance Detection in Cataract Surgery Videos by Spatio-Temporal Action Localization

Apr 29, 2021

Negin Ghamsarian, Mario Taschwer, Doris Putzgruber-Adamitsch, Stephanie Sarny, Klaus Schoeffmann

Abstract: Cataract surgery is performed with the help of a microscope. Because the microscope allows at most two people to watch the surgery in real time, a major part of surgical training is conducted using recorded videos. To optimize training with this video content, surgeons require an automatic relevance detection approach. Beyond relevance-based retrieval, the detection results can also be used for skill assessment and irregularity detection in cataract surgery videos. In this paper, a three-module framework is proposed to detect and classify the relevant phase segments in cataract videos. First, an idle frame recognition network divides the video into idle and action segments. Second, to boost relevance detection performance, the cornea, where the relevant surgical actions are conducted, is detected in all frames using Mask R-CNN. Third, the spatiotemporally localized segments, which contain higher-resolution information about the pupil texture and actions, are fed into the relevance detection module together with complementary temporal information from the same phase. This module consists of four parallel recurrent CNNs, each responsible for detecting one of four relevant phases that have been defined with medical experts. Their results are then integrated to classify each action phase as irrelevant or as one of the four relevant phases. Experimental results show that the proposed approach outperforms static CNNs and different configurations of feature-based and end-to-end recurrent networks.
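The two bookkeeping steps described in the abstract, splitting a video into idle and action segments and integrating the outputs of four parallel detectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the per-frame idle flags and per-detector scores are assumed to come from already trained networks, the phase names are placeholders, and the integration rule (highest score wins if it reaches a threshold, otherwise "irrelevant") is an assumption, since the abstract does not specify it.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

# Placeholder names: the abstract only states that four relevant phases exist.
RELEVANT_PHASES = ("phase_1", "phase_2", "phase_3", "phase_4")


def split_segments(idle_flags: Sequence[bool]) -> List[Tuple[str, int, int]]:
    """Group consecutive per-frame idle flags into (label, start, end) runs.

    `end` is exclusive. The flags are assumed to be the thresholded output
    of an idle frame recognition network (hypothetical input format).
    """
    segments = []
    start = 0
    for is_idle, run in groupby(idle_flags):
        length = len(list(run))
        label = "idle" if is_idle else "action"
        segments.append((label, start, start + length))
        start += length
    return segments


def integrate(scores: Sequence[float], threshold: float = 0.5) -> str:
    """Combine four detector scores for one action segment into one label.

    Assumed rule (not taken from the paper): the highest-scoring detector
    wins if its score reaches `threshold`; otherwise the segment is
    labeled irrelevant.
    """
    if len(scores) != len(RELEVANT_PHASES):
        raise ValueError("expected one score per relevant phase")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return RELEVANT_PHASES[best] if scores[best] >= threshold else "irrelevant"


if __name__ == "__main__":
    flags = [True, True, False, False, False, True]
    print(split_segments(flags))
    print(integrate([0.1, 0.8, 0.3, 0.2]))
```

In this sketch only the segments labeled "action" would be passed on to the four detectors; idle segments are skipped.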

* 8 pages, 4 figures, accepted at the 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 2020
