Abstract:Training networks to perform metric relocalization traditionally requires accurate image correspondences. In practice, these are obtained by restricting domain coverage, employing additional sensors, or capturing large multi-view datasets. We instead propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration. Guided by this intuition, we derive a novel transform consistency loss. Using this loss function, we train a deep neural network to infer dense feature and saliency maps to perform robust metric relocalization in dynamic environments. We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
Abstract:We present a novel algorithm for light source estimation in scenes reconstructed with a RGB-D camera based on an analytically-derived formulation of path-tracing. Our algorithm traces the reconstructed scene with a custom path-tracer and computes the analytical derivatives of the light transport equation from principles in optics. These derivatives are then used to perform gradient descent, minimizing the photometric error between one or more captured reference images and renders of our current lighting estimation using an environment map parameterization for light sources. We show that our approach of modeling all light sources as points at infinity approximates lights located near the scene with surprising accuracy. Due to the analytical formulation of derivatives, optimization to the solution is considerably accelerated. We verify our algorithm using both real and synthetic data.