Olaf Kähler

InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure

Aug 02, 2017

Victor Adrian Prisacariu, Olaf Kähler, Stuart Golodetz, Michael Sapienza, Tommaso Cavallari, Philip H S Torr, David W Murray

Abstract: Volumetric models have become a popular representation for 3D scenes in recent years. One breakthrough leading to their popularity was KinectFusion, which focuses on 3D reconstruction using RGB-D sensors. However, monocular SLAM has since also been tackled with very similar approaches. Representing the reconstruction volumetrically as a TSDF leads to most of the simplicity and efficiency that can be achieved with GPU implementations of these systems. However, this representation is memory-intensive and limits applicability to small-scale reconstructions. Several avenues have been explored to overcome this. With the aim of summarizing them and providing for a fast, flexible 3D reconstruction pipeline, we propose a new, unifying framework called InfiniTAM. The idea is that steps like camera tracking, scene representation and integration of new data can easily be replaced and adapted to the user's needs. This report describes the technical implementation details of InfiniTAM v3, the third version of our InfiniTAM system. We have added various new features, as well as making numerous enhancements to the low-level code that significantly improve our camera tracking performance. The new features that we expect to be of most interest are (i) a robust camera tracking module; (ii) an implementation of Glocker et al.'s keyframe-based random ferns camera relocaliser; (iii) a novel approach to globally-consistent TSDF-based reconstruction, based on dividing the scene into rigid submaps and optimising the relative poses between them; and (iv) an implementation of Keller et al.'s surfel-based reconstruction approach.

* This article largely supersedes arXiv:1410.0925 (it describes version 3 of the InfiniTAM framework)
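
The abstract above refers to storing the reconstruction as a TSDF (truncated signed distance function). As a rough illustration of what fusing one depth measurement into one voxel can look like, here is a minimal sketch using the weighted running average commonly associated with KinectFusion-style systems. The function names, the truncation band `mu`, and the weight cap `max_weight` are illustrative assumptions, not InfiniTAM's actual API or defaults.

```python
def truncate_sdf(sdf, mu):
    """Scale a signed distance by the truncation band mu and clamp to [-1, 1]."""
    return max(-1.0, min(1.0, sdf / mu))


def fuse_voxel(tsdf_old, weight_old, sdf_new, mu, max_weight=100):
    """Fuse one new signed-distance observation into a single voxel.

    sdf_new is the measured depth minus the voxel's depth along the camera ray
    (positive in front of the surface). All parameter names are hypothetical.
    """
    if sdf_new < -mu:
        # Voxel lies well behind the observed surface: treat it as occluded
        # and leave the stored value untouched.
        return tsdf_old, weight_old
    d = truncate_sdf(sdf_new, mu)
    # Running average with an observation weight of 1.
    tsdf = (tsdf_old * weight_old + d) / (weight_old + 1)
    # Cap the weight so the model can still adapt to later changes.
    weight = min(weight_old + 1, max_weight)
    return tsdf, weight


if __name__ == "__main__":
    value, weight = 0.0, 0
    for sdf in (0.01, -0.01, 0.004):
        value, weight = fuse_voxel(value, weight, sdf, mu=0.02)
        print(value, weight)
```

The surface is then taken to be the zero crossing of the fused values; averaging over many frames is what smooths out per-frame sensor noise.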

SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes

Oct 13, 2015

Stuart Golodetz, Michael Sapienza, Julien P. C. Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A. Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W. Murray (+2 more)

Abstract: We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes. Using our system, a user can walk into a room wearing a depth camera and a virtual reality headset, and both densely reconstruct the 3D scene and interactively segment the environment into object classes such as 'chair', 'floor' and 'table'. The user interacts physically with the real-world scene, touching objects and using voice commands to assign them appropriate labels. These user-generated labels are leveraged by an online random forest-based machine learning algorithm, which is used to predict labels for previously unseen parts of the scene. The entire pipeline runs in real time, and the user stays 'in the loop' throughout the process, receiving immediate feedback about the progress of the labelling and interacting with the scene as necessary to refine the predicted segmentation.

* 33 pages, Project: http://www.semantic-paint.com, Code: https://github.com/torrvision/spaint

A Framework for the Volumetric Integration of Depth Images

Oct 23, 2014

Victor Adrian Prisacariu, Olaf Kähler, Ming Ming Cheng, Carl Yuheng Ren, Julien Valentin, Philip H. S. Torr, Ian D. Reid, David W. Murray

Abstract: Volumetric models have become a popular representation for 3D scenes in recent years. One of the breakthroughs leading to their popularity was KinectFusion, where the focus is on 3D reconstruction using RGB-D sensors. However, monocular SLAM has since also been tackled with very similar approaches. Representing the reconstruction volumetrically as a truncated signed distance function leads to most of the simplicity and efficiency that can be achieved with GPU implementations of these systems. However, this representation is also memory-intensive and limits the applicability to small-scale reconstructions. Several avenues have been explored for overcoming this limitation. With the aim of summarizing them and providing for a fast and flexible 3D reconstruction pipeline, we propose a new, unifying framework called InfiniTAM. The core idea is that individual steps like camera tracking, scene representation and integration of new data can easily be replaced and adapted to the needs of the user. Along with the framework we also provide a set of components for scalable reconstruction: two implementations of camera trackers, based on RGB data and on depth data, two representations of the 3D volumetric data, a dense volume and one based on hashes of subblocks, and an optional module for swapping subblocks in and out of the typically limited GPU memory.

* 17 pages, 8 figures
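
The abstract above mentions a volumetric representation "based on hashes of subblocks" as an alternative to a dense volume. The following is a minimal sketch of that general idea: voxels are grouped into small cubic blocks, and only blocks that have been written to are allocated, looked up through a hash of their integer block coordinates. The block size, the bucket count, the class and method names, and the three large primes in the hash (constants often used for spatial hashing) are all assumptions for illustration, not details taken from the paper.

```python
BLOCK_SIZE = 8  # voxels per block side (assumed value)


def block_coords(voxel):
    """Integer coordinates of the block containing a voxel (floor division)."""
    return tuple(c // BLOCK_SIZE for c in voxel)


def block_hash(block, num_buckets):
    """Hash block coordinates into a bucket index; the primes are an assumption."""
    x, y, z = block
    return ((x * 73856093) ^ (y * 19349669) ^ (z * 83492791)) % num_buckets


def local_index(voxel):
    """Linear index of a voxel inside its block."""
    lx, ly, lz = (c % BLOCK_SIZE for c in voxel)
    return lx + BLOCK_SIZE * (ly + BLOCK_SIZE * lz)


class SparseVolume:
    """Hypothetical sparse voxel store: bucket index -> list of (block, data)."""

    def __init__(self, num_buckets=1 << 16):
        self.num_buckets = num_buckets
        self.buckets = {}

    def _find(self, block):
        # Walk the bucket's chain to resolve hash collisions.
        chain = self.buckets.get(block_hash(block, self.num_buckets), [])
        for stored_block, data in chain:
            if stored_block == block:
                return data
        return None

    def write(self, voxel, value):
        block = block_coords(voxel)
        data = self._find(block)
        if data is None:
            # Allocate the block lazily, the first time one of its voxels is touched.
            data = [None] * BLOCK_SIZE ** 3
            key = block_hash(block, self.num_buckets)
            self.buckets.setdefault(key, []).append((block, data))
        data[local_index(voxel)] = value

    def read(self, voxel):
        data = self._find(block_coords(voxel))
        return None if data is None else data[local_index(voxel)]

    def num_blocks(self):
        return sum(len(chain) for chain in self.buckets.values())
```

For example, writing `0.25` at voxel `(9, -1, 3)` allocates only the single block `(1, -1, 0)`; reading any voxel in an untouched block returns `None`. Memory then scales with the observed surface area rather than with the bounding volume, which is the motivation the abstract gives for moving beyond a dense grid.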
