Abstract: We address the problem of detecting and erasing furniture from a wide-angle photograph of a room. Inpainting large regions of an indoor scene often produces geometric inconsistencies in the background elements within the inpaint mask. To address this problem, we use perceptual information (e.g., instance segmentation and room layout) to produce a geometrically consistent empty version of the room. We share the details that make this system viable, such as per-plane inpainting, automatic rectification, and texture refinement. We provide a detailed ablation study along with qualitative examples to justify our design choices. We demonstrate an application of our system by removing real furniture from a room and redecorating it with virtual furniture.
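As a rough illustration of how instance segmentation and room layout might be combined for per-plane inpainting, the sketch below splits a binary furniture mask into one inpaint mask per layout plane. The function name `per_plane_masks`, the nested-list image representation, and the string plane labels are assumptions made for this example; the abstract does not specify how the masks are built.

```python
def per_plane_masks(furniture_mask, layout_labels):
    """Split a binary furniture mask into one inpaint mask per layout plane.

    furniture_mask: rows of 0/1 values, 1 where furniture was detected.
    layout_labels:  rows of plane labels (e.g. "wall", "floor"), same shape.
    Both inputs are hypothetical stand-ins for real segmentation outputs.
    """
    height = len(furniture_mask)
    masks = {}
    for r, (mask_row, label_row) in enumerate(zip(furniture_mask, layout_labels)):
        for c, (is_furniture, plane) in enumerate(zip(mask_row, label_row)):
            if is_furniture:
                # Create an all-zero mask the first time a plane is seen.
                plane_mask = masks.setdefault(
                    plane, [[0] * len(mask_row) for _ in range(height)]
                )
                plane_mask[r][c] = 1
    return masks


furniture_mask = [[0, 1, 1],
                  [0, 1, 1]]
layout_labels = [["wall", "wall", "wall"],
                 ["floor", "floor", "floor"]]
masks = per_plane_masks(furniture_mask, layout_labels)
```

Each resulting mask covers only the furniture pixels lying on a single plane, so each plane could in principle be rectified and inpainted on its own before the results are composited.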
Abstract: In this paper, we address the degradation in inpainting quality of neural networks operating at high resolutions. Inpainting networks are often unable to generate globally coherent structures at resolutions higher than those in their training set. This is partially attributed to the receptive field remaining static despite an increase in image resolution. Although downscaling the image prior to inpainting produces coherent structure, the result inherently lacks the detail present at higher resolutions. To get the best of both worlds, we optimize the intermediate feature maps of a network by minimizing a multiscale consistency loss at inference. This runtime optimization improves the inpainting results and establishes a new state of the art for high-resolution inpainting. Code is available at: https://github.com/geomagical/lama-with-refiner/tree/refinement.
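The idea of optimizing feature maps against a multiscale consistency loss can be sketched on a one-dimensional toy problem. Everything here is an assumption for illustration: `decode` is a fixed smoothing filter standing in for the network layers after the optimized feature map, `downscale` is pair averaging, the reference values are invented, and central differences replace the backpropagation a real implementation would use. It is not the code from the linked repository.

```python
def decode(features):
    """Stand-in decoder: a fixed 3-tap smoothing filter with clamped borders."""
    n = len(features)
    return [
        0.25 * features[max(i - 1, 0)]
        + 0.5 * features[i]
        + 0.25 * features[min(i + 1, n - 1)]
        for i in range(n)
    ]


def downscale(signal):
    """Halve the resolution by averaging neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal) - 1, 2)]


def consistency_loss(features, low_res_reference, hole):
    """Mean squared gap between the downscaled high-res output and the
    low-res reference, measured only at the low-res indices in `hole`."""
    coarse = downscale(decode(features))
    return sum((coarse[j] - low_res_reference[j]) ** 2 for j in hole) / len(hole)


def refine(features, low_res_reference, hole, steps=200, lr=0.5, eps=1e-5):
    """Gradient descent on the features; central differences stand in for backprop."""
    z = list(features)
    for _ in range(steps):
        grad = []
        for i in range(len(z)):
            up = z[:i] + [z[i] + eps] + z[i + 1:]
            down = z[:i] + [z[i] - eps] + z[i + 1:]
            grad.append(
                (consistency_loss(up, low_res_reference, hole)
                 - consistency_loss(down, low_res_reference, hole)) / (2 * eps)
            )
        z = [v - lr * g for v, g in zip(z, grad)]
    return z


# Hypothetical data: 8 high-res features, a 4-sample low-res reference,
# and a hole covering low-res indices 1 and 2.
initial_features = [0.2, 0.2, 0.0, 1.0, 0.0, 1.0, 0.1, 0.1]
low_res_reference = [0.2, 0.8, 0.6, 0.1]
hole = [1, 2]
refined_features = refine(initial_features, low_res_reference, hole)
```

Only the features are updated while the decoder stays fixed, which mirrors the inference-time setting described above: the network weights are untouched and the optimization runs separately for each image.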
Abstract: Illumination estimation is often used in mixed reality to re-render a scene from another point of view, to change the color or texture of an object, or to insert a consistently lit virtual object into a real video or photograph. Specifically, estimating a point light source is required for the shadows cast by the inserted object to be consistent with the real scene. We treat illumination retrieval from an RGB-D image of the scene as an inverse problem: we aim to find the illumination that minimizes the photometric error between the rendered image and the observation. In particular, we propose a novel differentiable renderer based on the Blinn-Phong model with cast shadows. We compare our differentiable renderer to state-of-the-art methods and demonstrate its robustness to an incorrect reflectance estimation.
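The inverse-problem formulation can be illustrated with a minimal sketch: render a few surface points with Blinn-Phong shading under a point light, then search for the light position that lowers the photometric error against an observation. This is a simplified stand-in, not the renderer proposed in the abstract: it omits cast shadows, uses scalar intensities, approximates gradients by central differences instead of automatic differentiation, and all scene values, reflectance coefficients, and function names are assumptions.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / max(norm, 1e-12) for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def blinn_phong(point, normal, light_pos, camera_pos, kd=0.7, ks=0.3, shininess=16.0):
    """Intensity of one surface point under a point light (no cast shadows)."""
    to_light = normalize(sub(light_pos, point))
    to_camera = normalize(sub(camera_pos, point))
    half = normalize(tuple(l + v for l, v in zip(to_light, to_camera)))
    diffuse = kd * max(dot(normal, to_light), 0.0)
    specular = ks * max(dot(normal, half), 0.0) ** shininess
    return diffuse + specular


def render(light_pos, points, normals, camera_pos):
    return [blinn_phong(p, n, light_pos, camera_pos) for p, n in zip(points, normals)]


def photometric_error(light_pos, observed, points, normals, camera_pos):
    rendered = render(light_pos, points, normals, camera_pos)
    return sum((r - o) ** 2 for r, o in zip(rendered, observed)) / len(observed)


def estimate_light(observed, points, normals, camera_pos, init, steps=200, eps=1e-4):
    """Gradient descent with backtracking; only error-reducing steps are accepted."""
    def err(light):
        return photometric_error(light, observed, points, normals, camera_pos)

    light, loss, step = tuple(init), err(init), 1.0
    for _ in range(steps):
        grad = []
        for i in range(3):
            hi = tuple(c + eps if j == i else c for j, c in enumerate(light))
            lo = tuple(c - eps if j == i else c for j, c in enumerate(light))
            grad.append((err(hi) - err(lo)) / (2 * eps))
        while step > 1e-8:
            candidate = tuple(c - step * g for c, g in zip(light, grad))
            candidate_loss = err(candidate)
            if candidate_loss < loss:
                light, loss, step = candidate, candidate_loss, step * 2.0
                break
            step *= 0.5
        else:
            break  # no improving step found; stop early
    return light, loss


# Hypothetical scene: a 5x5 grid of floor points seen from a fixed camera.
points = [(x / 2.0, y / 2.0, 0.0) for x in range(-2, 3) for y in range(-2, 3)]
normals = [(0.0, 0.0, 1.0)] * len(points)
camera_pos = (0.0, -2.0, 1.5)
true_light = (0.5, 0.3, 2.0)
initial_guess = (-0.5, -0.5, 1.0)
observed = render(true_light, points, normals, camera_pos)
estimated_light, final_error = estimate_light(
    observed, points, normals, camera_pos, initial_guess
)
```

Because a step is accepted only when it reduces the error, the final photometric error never exceeds the error at the initial guess. The problem is non-convex, so the sketch makes no claim that the estimate reaches the true light position.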