Abstract:Robotic-assisted surgeries benefit both surgeons and patients, however, surgeons frequently need to adjust the endoscopic camera to achieve good viewpoints. Simultaneously controlling the camera and the surgical instruments is impossible, and consequentially, these camera adjustments repeatedly interrupt the surgery. Autonomous camera control could help overcome this challenge, but most existing systems are reactive, e.g., by having the camera follow the surgical instruments. We propose a predictive approach for anticipating when camera movements will occur using artificial neural networks. We used the kinematic data of the surgical instruments, which were recorded during robotic-assisted surgical training on porcine models. We split the data into segments, and labeled each either as a segment that immediately precedes a camera movement, or one that does not. Due to the large class imbalance, we trained an ensemble of networks, each on a balanced sub-set of the training data. We found that the instruments' kinematic data can be used to predict when camera movements will occur, and evaluated the performance on different segment durations and ensemble sizes. We also studied how much in advance an upcoming camera movement can be predicted, and found that predicting a camera movement 0.25, 0.5, and 1 second before they occurred achieved 98%, 94%, and 84% accuracy relative to the prediction of an imminent camera movement. This indicates that camera movement events can be predicted early enough to leave time for computing and executing an autonomous camera movement and suggests that an autonomous camera controller for RAMIS may one day be feasible.