Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Janarbek Matai

Factorized Inverse Path Tracing for Efficient and Accurate Material-Lighting Estimation

Apr 12, 2023

Liwen Wu, Rui Zhu, Mustafa B. Yaldiz, Yinhao Zhu, Hong Cai, Janarbek Matai, Fatih Porikli, Tzu-Mao Li, Manmohan Chandraker, Ravi Ramamoorthi

Abstract:Inverse path tracing has recently been applied to joint material and lighting estimation, given geometry and multi-view HDR observations of an indoor scene. However, it has two major limitations: path tracing is expensive to compute, and ambiguities exist between reflection and emission. We propose a novel Factorized Inverse Path Tracing (FIPT) method which utilizes a factored light transport formulation and finds emitters driven by rendering errors. Our algorithm enables accurate material and lighting optimization faster than previous work, and is more effective at resolving ambiguities. The exhaustive experiments on synthetic scenes show that our method (1) outperforms state-of-the-art indoor inverse rendering and relighting methods particularly in the presence of complex illumination effects; (2) speeds up inverse path tracing optimization to less than an hour. We further demonstrate robustness to noisy inputs through material and lighting estimates that allow plausible relighting in a real scene. The source code is available at: https://github.com/lwwu2/fipt

* Updated results from MILO on Apr 10, 2023

Via

Access Paper or Ask Questions

IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

Jun 16, 2022

Rui Zhu, Zhengqin Li, Janarbek Matai, Fatih Porikli, Manmohan Chandraker

Figure 1 for IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

Figure 2 for IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

Figure 3 for IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

Figure 4 for IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

Abstract:Indoor scenes exhibit significant appearance variations due to myriad interactions between arbitrarily diverse object shapes, spatially-changing materials, and complex lighting. Shadows, highlights, and inter-reflections caused by visible and invisible light sources require reasoning about long-range interactions for inverse rendering, which seeks to recover the components of image formation, namely, shape, material, and lighting. In this work, our intuition is that the long-range attention learned by transformer architectures is ideally suited to solve longstanding challenges in single-image inverse rendering. We demonstrate with a specific instantiation of a dense vision transformer, IRISformer, that excels at both single-task and multi-task reasoning required for inverse rendering. Specifically, we propose a transformer architecture to simultaneously estimate depths, normals, spatially-varying albedo, roughness and lighting from a single image of an indoor scene. Our extensive evaluations on benchmark datasets demonstrate state-of-the-art results on each of the above tasks, enabling applications like object insertion and material editing in a single unconstrained real image, with greater photorealism than prior works. Code and data are publicly released at https://github.com/ViLab-UCSD/IRISformer.

* CVPR 22 camera ready version with supplementary

Via

Access Paper or Ask Questions

X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation

Oct 24, 2021

Hong Cai, Janarbek Matai, Shubhankar Borse, Yizhe Zhang, Amin Ansari, Fatih Porikli

Figure 1 for X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation

Figure 2 for X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation

Figure 3 for X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation

Figure 4 for X-Distill: Improving Self-Supervised Monocular Depth via Cross-Task Distillation

Abstract:In this paper, we propose a novel method, X-Distill, to improve the self-supervised training of monocular depth via cross-task knowledge distillation from semantic segmentation to depth estimation. More specifically, during training, we utilize a pretrained semantic segmentation teacher network and transfer its semantic knowledge to the depth network. In order to enable such knowledge distillation across two different visual tasks, we introduce a small, trainable network that translates the predicted depth map to a semantic segmentation map, which can then be supervised by the teacher network. In this way, this small network enables the backpropagation from the semantic segmentation teacher's supervision to the depth network during training. In addition, since the commonly used object classes in semantic segmentation are not directly transferable to depth, we study the visual and geometric characteristics of the objects and design a new way of grouping them that can be shared by both tasks. It is noteworthy that our approach only modifies the training process and does not incur additional computation during inference. We extensively evaluate the efficacy of our proposed approach on the standard KITTI benchmark and compare it with the latest state of the art. We further test the generalizability of our approach on Make3D. Overall, the results show that our approach significantly improves the depth estimation accuracy and outperforms the state of the art.

* Accepted to BMVC 2021

Via

Access Paper or Ask Questions