Abstract: Surgical domain models improve workflow optimization through automated predictions of each staff member's surgical role. However, mounting evidence indicates that team familiarity and individuality also impact surgical outcomes. We present a novel staff-centric modeling approach that characterizes individual team members through their distinctive movement patterns and physical characteristics, enabling long-term tracking and analysis of surgical personnel across multiple procedures. To address the challenge of inter-clinic variability, we develop a generalizable re-identification framework that encodes sequences of 3D point clouds to capture the shape and articulated motion patterns unique to each individual. Our method achieves 86.19% accuracy on realistic clinical data and maintains 75.27% accuracy when transferring between different environments, a 12% improvement over existing methods. When used to augment markerless personnel tracking, our approach improves tracking accuracy by over 50%. Through extensive validation across three datasets and the introduction of a novel workflow visualization technique, we demonstrate how our framework can reveal new insights into surgical team dynamics and space utilization patterns, advancing the analysis of surgical workflows and team coordination.
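The abstract does not specify the encoder architecture, so the following is only a minimal sketch of how sequences of 3D point clouds could be encoded into identity embeddings for re-identification: a PointNet-style per-point MLP with symmetric max-pooling captures per-frame shape, a GRU aggregates frames to capture motion, and identities are matched by cosine similarity. All module names, dimensions, and the matching scheme are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of point-cloud-sequence re-identification;
# not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointCloudSequenceEncoder(nn.Module):
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        # Shared per-point MLP (PointNet-style), applied to xyz coordinates.
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 256),
        )
        # Temporal aggregation over the frame sequence.
        self.gru = nn.GRU(input_size=256, hidden_size=embed_dim, batch_first=True)

    def forward(self, clouds: torch.Tensor) -> torch.Tensor:
        # clouds: (batch, frames, points, 3), one tracked person over time.
        feats = self.point_mlp(clouds)             # (b, t, n, 256)
        frame_feats = feats.max(dim=2).values      # symmetric max-pool over points
        _, hidden = self.gru(frame_feats)          # hidden: (1, b, embed_dim)
        return F.normalize(hidden[-1], dim=-1)     # unit-length identity embedding

# Matching: a probe sequence is assigned the gallery identity with the
# highest cosine similarity between embeddings.
encoder = PointCloudSequenceEncoder()
probe = encoder(torch.randn(1, 16, 1024, 3))      # 16 frames, 1024 points each
gallery = encoder(torch.randn(5, 16, 1024, 3))    # 5 enrolled staff members
scores = probe @ gallery.T                        # cosine similarities
print(scores.argmax(dim=1))                       # best-matching identity
```

Because the embeddings depend on body shape and gait rather than appearance in any single camera view, a scheme of this kind can in principle transfer between clinics, which is the generalization property the abstract reports.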
Abstract: Purpose: Recent advances in Surgical Data Science (SDS) have contributed to an increase in video recordings from hospital environments. While methods such as surgical workflow recognition show potential for increasing the quality of patient care, the quantity of video data has surpassed the scale at which images can be manually anonymized. Existing automated 2D anonymization methods underperform in operating rooms (ORs) due to occlusions and obstructions. We propose to anonymize multi-view OR recordings using 3D data from multiple camera streams. Methods: RGB and depth images from multiple cameras are fused into a 3D point cloud representation of the scene. We then detect each individual's face in 3D by regressing a parametric human mesh model onto detected 3D human keypoints and aligning the face mesh with the fused 3D point cloud. The mesh model is rendered into every acquired camera view, replacing each individual's face. Results: Our method shows promise in locating faces at a higher rate than existing approaches. DisguisOR produces geometrically consistent anonymizations for each camera view, enabling more realistic anonymization that is less detrimental to downstream tasks. Conclusion: Frequent obstructions and crowding in operating rooms leave significant room for improvement for off-the-shelf anonymization methods. DisguisOR addresses privacy at the scene level and has the potential to facilitate further research in SDS.
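As a concrete illustration of the fusion step described in the Methods, here is a minimal sketch using Open3D, assuming calibrated cameras with known intrinsics and world-to-camera extrinsics. The file names and camera parameters are placeholders; this is not the DisguisOR implementation.

```python
# Hypothetical sketch of multi-view RGB-D fusion into one scene point cloud.
import numpy as np
import open3d as o3d

def fuse_views(views):
    """views: list of (color_path, depth_path, intrinsic, extrinsic) tuples,
    where extrinsic maps world coordinates to the camera frame."""
    fused = o3d.geometry.PointCloud()
    for color_path, depth_path, intrinsic, extrinsic in views:
        color = o3d.io.read_image(color_path)
        depth = o3d.io.read_image(depth_path)
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            color, depth, depth_scale=1000.0, depth_trunc=5.0,
            convert_rgb_to_intensity=False)
        # Back-project this view into the shared world frame and accumulate.
        fused += o3d.geometry.PointCloud.create_from_rgbd_image(
            rgbd, intrinsic, extrinsic)
    return fused.voxel_down_sample(voxel_size=0.01)  # thin duplicate points

# Example with one hypothetical camera placed at the world origin.
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    1280, 720, 600.0, 600.0, 640.0, 360.0)  # width, height, fx, fy, cx, cy
cloud = fuse_views([("cam0_color.png", "cam0_depth.png",
                     intrinsic, np.eye(4))])
```

Operating on this fused cloud rather than on individual 2D frames is what allows a single detected face to be anonymized consistently across every camera view, even when it is occluded in some of them.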