Title: Learning from Feedback: Semantic Enhancement for Object SLAM Using Foundation Models

Nov 11, 2024

Jungseok Hong, Ran Choi, John J. Leonard

Abstract: Semantic Simultaneous Localization and Mapping (SLAM) systems struggle to map semantically similar objects in close proximity, especially in cluttered indoor environments. We introduce Semantic Enhancement for Object SLAM (SEO-SLAM), a novel SLAM system that leverages Vision-Language Models (VLMs) and Multimodal Large Language Models (MLLMs) to enhance object-level semantic mapping in such environments. SEO-SLAM tackles existing challenges by (1) generating more specific and descriptive open-vocabulary object labels using MLLMs, (2) simultaneously correcting factors that cause erroneous landmarks, and (3) dynamically updating a multiclass confusion matrix to mitigate object detector biases. Our approach enables more precise distinctions between similar objects and maintains map coherence by reflecting scene changes through MLLM feedback. We evaluate SEO-SLAM on our challenging dataset, demonstrating enhanced accuracy and robustness in environments with multiple similar objects. Our system outperforms existing approaches in terms of landmark matching accuracy and semantic consistency. The results show that MLLM feedback improves object-centric semantic mapping. Our dataset is publicly available at: jungseokhong.com/SEO-SLAM.
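The abstract does not say how the multiclass confusion matrix in point (3) is updated, so the following is only a minimal sketch of one common way to keep such a matrix current, not the authors' implementation. It assumes that each piece of feedback yields a (confirmed class, detected class) pair, and it uses additive smoothing so that unseen pairs keep a nonzero probability. The class name `ConfusionMatrixTracker` and its methods are hypothetical and chosen for illustration.

```python
import numpy as np


class ConfusionMatrixTracker:
    """Running multiclass confusion matrix (hypothetical helper, not from the paper)."""

    def __init__(self, num_classes, prior=1.0):
        # counts[i, j]: pseudo-count of "confirmed class i, detector reported class j".
        # Starting every cell at `prior` is additive smoothing (an assumption here).
        self.counts = np.full((num_classes, num_classes), prior, dtype=float)

    def update(self, confirmed_class, detected_class):
        # Record one feedback event for the given class pair.
        self.counts[confirmed_class, detected_class] += 1.0

    def likelihood(self):
        # Normalize each row to estimate P(detected = j | confirmed = i).
        return self.counts / self.counts.sum(axis=1, keepdims=True)


tracker = ConfusionMatrixTracker(num_classes=3)
tracker.update(confirmed_class=0, detected_class=1)
tracker.update(confirmed_class=0, detected_class=1)
lik = tracker.likelihood()
print(lik[0])  # estimated detection distribution for confirmed class 0
```

In this sketch, a detector that repeatedly reports class 1 for objects confirmed as class 0 shifts probability mass toward that off-diagonal cell, which a downstream data-association step could use to discount the biased label.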
