Abstract:Each photo in an image burst can be considered a sample of a complex 3D scene: the product of parallax, diffuse and specular materials, scene motion, and illuminant variation. While decomposing all of these effects from a stack of misaligned images is a highly ill-conditioned task, the conventional align-and-merge burst pipeline takes the other extreme: blending them into a single image. In this work, we propose a versatile intermediate representation: a two-layer alpha-composited image plus flow model constructed with neural spline fields -- networks trained to map input coordinates to spline control points. Our method is able to, during test-time optimization, jointly fuse a burst image capture into one high-resolution reconstruction and decompose it into transmission and obstruction layers. Then, by discarding the obstruction layer, we can perform a range of tasks including seeing through occlusions, reflection suppression, and shadow removal. Validated on complex synthetic and in-the-wild captures we find that, with no post-processing steps or learned priors, our generalizable model is able to outperform existing dedicated single-image and multi-view obstruction removal approaches.
Abstract:Can the latent spaces of modern generative neural rendering models serve as representations for 3D-aware discriminative visual understanding tasks? We use retrieval as a proxy for measuring the metric learning properties of the latent spaces of Shap-E, including capturing view-independence and enabling the aggregation of scene representations from the representations of individual image views, and find that Shap-E representations outperform those of the classical EfficientNet baseline representations zero-shot, and is still competitive when both methods are trained using a contrative loss. These findings give preliminary indication that 3D-based rendering and generative models can yield useful representations for discriminative tasks in our innately 3D-native world. Our code is available at \url{https://github.com/michaelwilliamtang/golden-retriever}.