Title: Depth Field Networks for Generalizable Multi-view Scene Representation

Jul 28, 2022

Vitor Guizilini, Igor Vasiljevic, Jiading Fang, Rares Ambrus, Greg Shakhnarovich, Matthew Walter, Adrien Gaidon

Abstract: Modern 3D computer vision leverages learning to boost geometric reasoning, mapping image data to classical structures such as cost volumes or epipolar constraints to improve matching. These architectures are specialized according to the particular problem, and thus require significant task-specific tuning, often leading to poor domain generalization performance. Recently, generalist Transformer architectures have achieved impressive results in tasks such as optical flow and depth estimation by encoding geometric priors as inputs rather than as enforced constraints. In this paper, we extend this idea and propose to learn an implicit, multi-view consistent scene representation, introducing a series of 3D data augmentation techniques as a geometric inductive prior to increase view diversity. We also show that introducing view synthesis as an auxiliary task further improves depth estimation. Our Depth Field Networks (DeFiNe) achieve state-of-the-art results in stereo and video depth estimation without explicit geometric constraints, and improve on zero-shot domain generalization by a wide margin.
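
To make "encoding geometric priors as inputs" concrete, one common way to hand camera geometry to a network is as a per-pixel viewing ray (camera center plus unit direction) computed from the intrinsics and pose. The sketch below is illustrative only and is not taken from the paper: the function names `pixel_ray` and `ray_embedding`, the pinhole model, the half-pixel offset, and the plain 6-vector embedding are all assumptions.

```python
import math


def pixel_ray(u, v, fx, fy, cx, cy, cam_to_world):
    """Return (origin, unit direction) in world coordinates for pixel (u, v).

    Assumes a pinhole camera and a 4x4 camera-to-world matrix given as
    nested lists. Both are illustrative assumptions, not the paper's API.
    """
    # Back-project the pixel center (hence the +0.5) onto the z = 1 plane.
    d_cam = ((u + 0.5 - cx) / fx, (v + 0.5 - cy) / fy, 1.0)
    # Rotate the direction into the world frame using the 3x3 rotation part.
    d_world = [
        sum(cam_to_world[i][j] * d_cam[j] for j in range(3)) for i in range(3)
    ]
    # Normalize so the embedding does not depend on the focal-length scale.
    norm = math.sqrt(sum(c * c for c in d_world))
    direction = [c / norm for c in d_world]
    # The ray origin is the camera center, i.e. the translation column.
    origin = [cam_to_world[i][3] for i in range(3)]
    return origin, direction


def ray_embedding(origin, direction):
    """Concatenate origin and direction into a 6-vector network input."""
    return list(origin) + list(direction)


if __name__ == "__main__":
    # Hypothetical camera: identity rotation, centered at (1, 2, 3).
    pose = [
        [1.0, 0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0, 2.0],
        [0.0, 0.0, 1.0, 3.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    o, d = pixel_ray(10, 20, 500.0, 500.0, 320.0, 240.0, pose)
    print(ray_embedding(o, d))
```

Because the geometry enters only as an input feature, nothing in such a design forces the network to respect epipolar constraints; it has to learn multi-view consistency from data, which is the trade-off the abstract describes.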

* Accepted to ECCV 2022. Project page: https://sites.google.com/view/tri-define
