Natural language interfaces to embodied AI are becoming increasingly prevalent in our daily lives. This opens further opportunities for language-based interaction with embodied agents, such as a user instructing an agent to execute a task in a specific location, for example, "put the bowls back in the cupboard next to the fridge" or "meet me at the intersection under the red sign." Such interactions require methods that interface between natural language and map representations of the environment. To this end, we explore whether an open-set natural language query can be used to identify a scene represented by a 3D scene graph. We define this task as "language-based scene retrieval." It is closely related to "coarse localization," but instead of localizing within a large-scale continuous map, we search for a match among a collection of disjoint scenes. We therefore present Text2SceneGraphMatcher, a scene-retrieval pipeline that learns joint embeddings between text descriptions and scene graphs to determine whether they match. The code, trained models, and datasets will be made public.
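
As a rough illustration of the joint-embedding idea (not the paper's actual architecture), the sketch below pairs a toy scene-graph encoder with a toy text encoder in a shared embedding space and retrieves the best-matching scene by cosine similarity. All module names, dimensions, and pooling choices are illustrative assumptions; a real system would use learned graph and language encoders trained with a matching objective.

    # Minimal sketch, assuming a dual-encoder retrieval setup (illustrative only).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyGraphEncoder(nn.Module):
        """One round of mean-neighbor message passing followed by mean pooling."""
        def __init__(self, node_dim: int, embed_dim: int):
            super().__init__()
            self.proj = nn.Linear(node_dim, embed_dim)
            self.update = nn.Linear(2 * embed_dim, embed_dim)

        def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # node_feats: (num_nodes, node_dim); adj: (num_nodes, num_nodes)
            h = self.proj(node_feats)
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
            neighbors = adj @ h / deg                        # mean over graph neighbors
            h = torch.relu(self.update(torch.cat([h, neighbors], dim=-1)))
            return h.mean(dim=0)                             # graph-level embedding

    class ToyTextEncoder(nn.Module):
        """Token embeddings + mean pooling; a real pipeline would use a language model."""
        def __init__(self, vocab_size: int, embed_dim: int):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, embed_dim)

        def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
            return self.emb(token_ids).mean(dim=0)

    def retrieve(query_emb: torch.Tensor, scene_embs: torch.Tensor) -> int:
        """Return the index of the scene whose embedding best matches the text query."""
        sims = F.cosine_similarity(query_emb.unsqueeze(0), scene_embs, dim=-1)
        return int(sims.argmax())

    if __name__ == "__main__":
        torch.manual_seed(0)
        graph_enc, text_enc = ToyGraphEncoder(16, 32), ToyTextEncoder(1000, 32)
        # Three random candidate scenes (nodes = objects, adj = spatial relations).
        scenes = [(torch.randn(n, 16), (torch.rand(n, n) > 0.5).float()) for n in (5, 8, 6)]
        scene_embs = torch.stack([graph_enc(x, a) for x, a in scenes])
        query_emb = text_enc(torch.randint(0, 1000, (7,)))  # stand-in for a tokenized instruction
        print("best-matching scene:", retrieve(query_emb, scene_embs))

In practice, the two encoders would be trained jointly (e.g., with a contrastive loss over matched and mismatched text-scene pairs) so that descriptions and their corresponding scene graphs land close together in the shared space.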