Huizhong Zhou

CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth

Apr 03, 2019

Jose M. Facil, Benjamin Ummenhofer, Huizhong Zhou, Luis Montesano, Thomas Brox, Javier Civera

Abstract: Single-view depth estimation suffers from the problem that a network trained on images from one camera does not generalize to images taken with a different camera model. Thus, changing the camera model requires collecting an entirely new training dataset. In this work, we propose a new type of convolution that can take the camera parameters into account, thus allowing neural networks to learn calibration-aware patterns. Experiments confirm that this improves the generalization capabilities of depth prediction networks considerably, and clearly outperforms the state of the art when the train and test images are acquired with different cameras.

* Camera ready version for CVPR 2019. Project page: http://webdiis.unizar.es/~jmfacil/camconvs/
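
The abstract does not spell out how the camera parameters enter the convolution. One plausible reading, offered here only as a sketch, is to derive per-pixel maps from the pinhole intrinsics and concatenate them to the feature maps as extra input channels. The function name `camera_channels`, its arguments, and the choice of maps (centered coordinates and per-pixel viewing angles) are assumptions made for illustration, not details confirmed by the text above.

```python
import math


def camera_channels(width, height, fx, fy, cx, cy):
    """Build per-pixel maps from pinhole intrinsics (hypothetical helper).

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Returns four height x width nested lists:
      ccx, ccy   - pixel coordinates relative to the principal point
      fovx, fovy - horizontal/vertical viewing angle of each pixel (radians)
    """
    # Coordinates centered on the principal point.
    ccx = [[x - cx for x in range(width)] for _ in range(height)]
    ccy = [[y - cy for _ in range(width)] for y in range(height)]
    # Viewing angle of each pixel column/row under the pinhole model.
    fovx = [[math.atan(v / fx) for v in row] for row in ccx]
    fovy = [[math.atan(v / fy) for v in row] for row in ccy]
    return ccx, ccy, fovx, fovy


# Toy example: a 4x3 image with focal length 2 px and principal point (1, 1).
ccx, ccy, fovx, fovy = camera_channels(4, 3, 2.0, 2.0, 1.0, 1.0)
```

Because these maps change whenever the intrinsics change, a network that receives them alongside the image features could, in principle, condition its filters on the calibration rather than memorizing one camera.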

DeepTAM: Deep Tracking and Mapping

Aug 07, 2018

Huizhong Zhou, Benjamin Ummenhofer, Thomas Brox

Abstract: We present a system for keyframe-based dense camera tracking and depth map estimation that is entirely learned. For tracking, we estimate small pose increments between the current camera image and a synthetic viewpoint. This significantly simplifies the learning problem and alleviates the dataset bias for camera motions. Further, we show that generating a large number of pose hypotheses leads to more accurate predictions. For mapping, we accumulate information in a cost volume centered at the current depth estimate. The mapping network then combines the cost volume and the keyframe image to update the depth prediction, thereby effectively making use of depth measurements and image-based priors. Our approach yields state-of-the-art results with few images and is robust with respect to noisy camera poses. We demonstrate that the performance of our 6 DOF tracking competes with RGB-D tracking algorithms. We compare favorably against strong classic and deep learning powered dense depth algorithms.

* Accepted to ECCV 2018 as oral. Project page: https://lmb.informatik.uni-freiburg.de/people/zhouh/deeptam/
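
The idea of a cost volume centered at the current depth estimate can be illustrated for a single pixel. The sketch below is only a toy under stated assumptions: the helper names `band_depths` and `accumulate_costs`, the relative band width, the even spacing of hypotheses, and the per-frame cost callables are all hypothetical, and the abstract does not say how the hypotheses are actually sampled. In the described system a learned mapping network consumes the volume; here the accumulated costs are simply returned.

```python
def band_depths(center, num_labels=5, rel_width=0.2):
    """Depth hypotheses evenly spaced in a band around `center`.

    The band spans [center * (1 - rel_width), center * (1 + rel_width)].
    """
    if num_labels < 2:
        raise ValueError("need at least two depth labels")
    lo = center * (1 - rel_width)
    step = 2 * center * rel_width / (num_labels - 1)
    return [lo + i * step for i in range(num_labels)]


def accumulate_costs(center, frame_costs, num_labels=5, rel_width=0.2):
    """Average a matching cost over several frames for each hypothesis.

    frame_costs: non-empty list of callables, one per frame, each mapping
    a depth hypothesis to a matching cost (stand-ins for a photometric
    error between the keyframe and that frame).
    """
    if not frame_costs:
        raise ValueError("need at least one frame")
    depths = band_depths(center, num_labels, rel_width)
    totals = [0.0] * num_labels
    for cost_fn in frame_costs:
        for i, d in enumerate(depths):
            totals[i] += cost_fn(d)
    return depths, [t / len(frame_costs) for t in totals]


# Toy example: two frames that both favor a depth of 2.5.
depths, costs = accumulate_costs(
    2.0, [lambda d: abs(d - 2.5), lambda d: abs(d - 2.5)], 5, 0.5
)
best_depth = depths[costs.index(min(costs))]
```

Re-centering the band on each updated estimate keeps the number of hypotheses small while still covering the depths that matter for the current pixel.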

DeMoN: Depth and Motion Network for Learning Monocular Stereo

Apr 11, 2017

Benjamin Ummenhofer, Huizhong Zhou, Jonas Uhrig, Nikolaus Mayer, Eddy Ilg, Alexey Dosovitskiy, Thomas Brox

Abstract: In this paper we formulate structure from motion as a learning problem. We train a convolutional network end-to-end to compute depth and camera motion from successive, unconstrained image pairs. The architecture is composed of multiple stacked encoder-decoder networks, the core part being an iterative network that is able to improve its own predictions. The network estimates not only depth and motion, but additionally surface normals, optical flow between the images and confidence of the matching. A crucial component of the approach is a training loss based on spatial relative differences. Compared to traditional two-frame structure from motion methods, results are more accurate and more robust. In contrast to the popular depth-from-single-image networks, DeMoN learns the concept of matching and, thus, better generalizes to structures not seen during training.

* Camera ready version for CVPR 2017. Supplementary material included. Project page: http://lmb.informatik.uni-freiburg.de/people/ummenhof/depthmotionnet/
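
The abstract names a training loss based on spatial relative differences without giving its form. A minimal sketch of one such loss follows, assuming that each discrete difference is normalized by the magnitudes of the two values involved, which makes the result insensitive to a global scale factor. The function names, the spacings, and the small `eps` stabilizer are assumptions of this sketch and may differ from what the authors use.

```python
import math


def relative_diffs(f, h, eps=1e-12):
    """Normalized horizontal/vertical differences of a 2D list at spacing h."""
    rows, cols = len(f), len(f[0])
    out = []
    for i in range(rows - h):
        for j in range(cols - h):
            a, right, down = f[i][j], f[i][j + h], f[i + h][j]
            # Dividing by the summed magnitudes cancels any global scale.
            gx = (right - a) / (abs(right) + abs(a) + eps)
            gy = (down - a) / (abs(down) + abs(a) + eps)
            out.append((gx, gy))
    return out


def relative_diff_loss(pred, target, spacings=(1, 2)):
    """Sum of L2 distances between normalized differences of pred and target."""
    total = 0.0
    for h in spacings:
        pairs = zip(relative_diffs(pred, h), relative_diffs(target, h))
        for (px, py), (tx, ty) in pairs:
            total += math.hypot(px - tx, py - ty)
    return total


# Toy example: a prediction that is a scaled copy of the target scores ~0.
target = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 6.0]]
scaled = [[2.0 * v for v in row] for row in target]
loss_scaled = relative_diff_loss(scaled, target)
```

Comparing differences rather than raw values penalizes errors in local structure, such as blurred depth edges, while leaving the overall scale unconstrained.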
