Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yixiang Jiang

S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields

Aug 14, 2023

Zeke Xie, Xindi Yang, Yujie Yang, Qi Sun, Yixiang Jiang, Haoran Wang, Yunfeng Cai, Mingming Sun

Figure 1 for S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields

Figure 2 for S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields

Figure 3 for S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields

Figure 4 for S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields

Abstract:Recently, Neural Radiance Field (NeRF) has shown great success in rendering novel-view images of a given scene by learning an implicit representation with only posed RGB images. NeRF and relevant neural field methods (e.g., neural surface representation) typically optimize a point-wise loss and make point-wise predictions, where one data point corresponds to one pixel. Unfortunately, this line of research failed to use the collective supervision of distant pixels, although it is known that pixels in an image or scene can provide rich structural information. To the best of our knowledge, we are the first to design a nonlocal multiplex training paradigm for NeRF and relevant neural field methods via a novel Stochastic Structural SIMilarity (S3IM) loss that processes multiple data points as a whole set instead of process multiple inputs independently. Our extensive experiments demonstrate the unreasonable effectiveness of S3IM in improving NeRF and neural surface representation for nearly free. The improvements of quality metrics can be particularly significant for those relatively difficult tasks: e.g., the test MSE loss unexpectedly drops by more than 90% for TensoRF and DVGO over eight novel view synthesis tasks; a 198% F-score gain and a 64% Chamfer $L_{1}$ distance reduction for NeuS over eight surface reconstruction tasks. Moreover, S3IM is consistently robust even with sparse inputs, corrupted images, and dynamic scenes.

* ICCV 2023 main conference. Code: https://github.com/Madaoer/S3IM. 14 pages, 5 figures, 17 tables

Via

Access Paper or Ask Questions