Abstract:In Monocular Keyframe Visual Simultaneous Localization and Mapping (MKVSLAM) frameworks, when incremental position tracking fails, global pose has to be recovered in a short-time window, also known as short-term relocalization. This capability is crucial for mobile robots to have reliable navigation, build accurate maps, and have precise behaviors around human collaborators. This paper focuses on the development of robust short-term relocalization capabilities for mobile robots using a monocular camera system. A novel multimodal keyframe descriptor is introduced, that contains semantic information of objects detected in the environment and the spatial information of the camera. Using this descriptor, a new Keyframe-based Place Recognition (KPR) method is proposed that is formulated as a multi-stage keyframe filtering algorithm, leading to a new relocalization pipeline for MKVSLAM systems. The proposed approach is evaluated over several indoor GPS denied datasets and demonstrates accurate pose recovery, in comparison to a bag-of-words approach.