Title: CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities

Jun 02, 2023

Ayush Agrawal, Raghav Arora, Ahana Datta, Snehasis Banerjee, Brojeshwar Bhowmick, Krishna Murthy Jatavallabhula, Mohan Sridharan, Madhava Krishna

Abstract: This paper introduces a novel method for determining the best room in which to place an object for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or policies learned through reinforcement learning (RL) for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifically, it (a) encodes a knowledge graph of prior human preferences about the room location of different objects in home environments, (b) incorporates vision-language features to support multimodal queries based on images or text, and (c) uses a graph network to learn object-room affinities based on embeddings of the prior knowledge and the vision-language features. We demonstrate that our approach provides better estimates of the most appropriate location of objects from a benchmark set of object categories than state-of-the-art baselines.

* RO-MAN 2023 Conference
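
The three components listed in the abstract can be illustrated with a toy sketch. This is not the CLIPGraphs implementation: the feature vectors below are hand-written stand-ins for vision-language embeddings, the edge list is a hypothetical knowledge graph of human preferences, and a single round of mean-neighbor averaging stands in for a learned graph network. The names `propagate`, `best_room`, and `alpha` are assumptions made for this example only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propagate(features, edges, alpha=0.5):
    """One round of mean-neighbor aggregation over an undirected graph.

    Each node's vector is blended with the mean of its neighbors' vectors.
    This is a simple, untrained stand-in for a graph network layer.
    """
    neighbors = {node: [] for node in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    updated = {}
    for node, vec in features.items():
        nbrs = neighbors[node]
        if not nbrs:
            updated[node] = list(vec)
            continue
        mean = [sum(features[n][i] for n in nbrs) / len(nbrs)
                for i in range(len(vec))]
        updated[node] = [(1 - alpha) * x + alpha * m for x, m in zip(vec, mean)]
    return updated


def best_room(obj, rooms, features):
    """Return the room whose embedding is most similar to the object's."""
    return max(rooms, key=lambda room: cosine(features[obj], features[room]))


# Hypothetical 3-d features standing in for vision-language embeddings.
features = {
    "kitchen": [1.0, 0.0, 0.0],
    "bathroom": [0.0, 1.0, 0.0],
    "bedroom": [0.0, 0.0, 1.0],
    "frying pan": [0.9, 0.1, 0.2],
    "toothbrush": [0.1, 0.8, 0.3],
    "pillow": [0.4, 0.4, 0.45],  # deliberately ambiguous before propagation
}

# Hypothetical knowledge-graph edges: (object, preferred room).
edges = [
    ("frying pan", "kitchen"),
    ("toothbrush", "bathroom"),
    ("pillow", "bedroom"),
]

rooms = ["kitchen", "bathroom", "bedroom"]
refined = propagate(features, edges)

for obj in ("frying pan", "toothbrush", "pillow"):
    print(obj, "->", best_room(obj, rooms, refined))
```

In this sketch the graph edges pull each object's vector toward its preferred room, so the affinity of an ambiguous object such as the pillow becomes sharper after propagation. A trained graph network would learn this aggregation from data rather than use a fixed average.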
