Zhi-Yang He

3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

Oct 06, 2019

Iro Armeni, Zhi-Yang He, JunYoung Gwak, Amir R. Zamir, Martin Fischer, Jitendra Malik, Silvio Savarese

Abstract: A comprehensive semantic understanding of a scene is important for many applications, but in what space should diverse semantic information (e.g., objects, scene categories, material types, texture) be grounded, and what should its structure be? Aspiring to have one unified structure that hosts diverse types of semantics, we follow the Scene Graph paradigm in 3D, generating a 3D Scene Graph. Given a 3D mesh and registered panoramic images, we construct a graph that spans the entire building and includes semantics on objects (e.g., class, material, and other attributes), rooms (e.g., scene category, volume), and cameras (e.g., location), as well as the relationships among these entities. However, this process is prohibitively labor-intensive if done manually. To alleviate this, we devise a semi-automatic framework that employs existing detection methods and enhances them using two main constraints: I. framing of query images sampled on panoramas to maximize the performance of 2D detectors, and II. multi-view consistency enforcement across 2D detections that originate in different camera locations.
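To make the structure described in the abstract concrete, here is a minimal Python sketch of a graph whose nodes are rooms, objects, and cameras, each carrying attributes, with named relationships between them. Every class, method, attribute key, and relation name below (`Node`, `SceneGraph`, `relate`, `in_room`, `sees`, `volume_m3`) is an illustrative assumption, not the paper's actual data format or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """One entity in the graph: a building, room, object, or camera."""
    node_id: str
    kind: str  # "building", "room", "object", or "camera"
    attributes: Dict[str, object] = field(default_factory=dict)


@dataclass
class SceneGraph:
    """Hypothetical container; names are illustrative, not the paper's API."""
    nodes: Dict[str, Node] = field(default_factory=dict)
    # Each edge is a (source id, relation name, target id) triple.
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def relate(self, source: str, relation: str, target: str) -> None:
        # Refuse edges whose endpoints have not been registered yet.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.edges.append((source, relation, target))

    def neighbors(self, source: str, relation: str) -> List[Node]:
        # Follow outgoing edges of one relation type from `source`.
        return [self.nodes[t] for s, r, t in self.edges
                if s == source and r == relation]


# Made-up example values, chosen only to mirror the attribute types
# the abstract lists (class, material, scene category, volume, location).
graph = SceneGraph()
graph.add_node(Node("room_1", "room", {"scene_category": "kitchen", "volume_m3": 30.0}))
graph.add_node(Node("obj_1", "object", {"class": "chair", "material": "wood"}))
graph.add_node(Node("cam_1", "camera", {"location": (1.0, 2.0, 1.5)}))
graph.relate("obj_1", "in_room", "room_1")
graph.relate("cam_1", "sees", "obj_1")
```

Storing relationships as labeled triples keeps the sketch small while still allowing queries such as "which objects does this camera see" through `neighbors`.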

* ICCV 2019

Gibson Env: Real-World Perception for Embodied Agents

Aug 31, 2018

Fei Xia, Amir Zamir, Zhi-Yang He, Alexander Sax, Jitendra Malik, Silvio Savarese

Abstract: Developing visual perception models for active agents and sensorimotor control is cumbersome in the physical world, as existing algorithms are too slow to learn efficiently in real time and robots are fragile and costly. This has given rise to learning-in-simulation, which in turn raises the question of whether the results transfer to the real world. In this paper, we are concerned with the problem of developing real-world perception for active agents, propose the Gibson Virtual Environment for this purpose, and showcase sample perceptual tasks learned therein. Gibson is based on virtualizing real spaces, rather than using artificially designed ones, and currently includes over 1400 floor spaces from 572 full buildings. The main characteristics of Gibson are: I. being from the real world and reflecting its semantic complexity, II. having an internal synthesis mechanism, "Goggles", enabling deployment of the trained models in the real world without needing further domain adaptation, III. embodiment of agents and making them subject to constraints of physics and space.

* CVPR 2018
* Access the code, dataset, and project website at http://gibsonenv.vision/
