Title: CLIP-Loc: Multi-modal Landmark Association for Global Localization in Object-based Maps

Feb 08, 2024

Shigemichi Matsuzaki, Takuma Sugino, Kazuhito Tanaka, Zijun Sha, Shintaro Nakaoka, Shintaro Yoshizawa, Kazuhiro Shintani

Abstract: This paper describes a multi-modal data association method for global localization using object-based maps and camera images. In global localization (relocalization) with object-based maps, existing methods typically match all possible combinations of detected objects and landmarks that share an object category, then extract inliers using RANSAC or brute-force search. This approach becomes infeasible as the number of landmarks increases, because the number of correspondence candidates grows exponentially. In this paper, we propose labeling landmarks with natural language descriptions and extracting correspondences based on their conceptual similarity to image observations, using a Vision Language Model (VLM). By leveraging detailed text information, our approach extracts correspondences more efficiently than methods that use only object categories. Through experiments, we demonstrate that the proposed method achieves more accurate global localization in fewer iterations than baseline methods, demonstrating its efficiency.
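The similarity-based candidate extraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each landmark description and each detected object has already been embedded into a shared vector space by a VLM such as CLIP, and it uses toy 3-D vectors in place of real embeddings. The names `cosine`, `top_k_candidates`, and the parameter `k` are hypothetical. For each observation, only the `k` most similar landmarks are kept as correspondence candidates, instead of every landmark of the same category.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_candidates(obs_embeddings, landmark_embeddings, k=2):
    """For each observation, return the indices of the k most similar landmarks.

    Both inputs are assumed to be lists of embedding vectors produced by the
    same vision-language model (hypothetical precomputed inputs).
    """
    candidates = []
    for obs in obs_embeddings:
        ranked = sorted(
            range(len(landmark_embeddings)),
            key=lambda j: cosine(obs, landmark_embeddings[j]),
            reverse=True,
        )
        candidates.append(ranked[:k])
    return candidates


# Toy vectors standing in for text embeddings of landmark descriptions.
landmarks = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.7, 0.0],
]
# Toy vectors standing in for image embeddings of detected objects.
observations = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
]

pairs = top_k_candidates(observations, landmarks, k=2)
print(pairs)
```

With `k` candidates per observation, a downstream inlier search (for example RANSAC) samples from a much smaller candidate set than exhaustive same-category matching would produce. How the candidates are then weighted or sampled is left open here.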

* 7 pages, 7 figures. Accepted to IEEE International Conference on Robotics and Automation (ICRA) 2024
