Abstract: Recent works on 3D reconstruction from posed images have demonstrated that direct inference of scene-level 3D geometry without iterative optimization is feasible using a deep neural network, showing remarkable promise and high efficiency. However, the reconstructed geometries, typically represented as a 3D truncated signed distance function (TSDF), are often coarse and lack fine geometric detail. To address this problem, we propose three effective solutions for improving the fidelity of inference-based 3D reconstructions. First, we present a resolution-agnostic TSDF supervision strategy that provides the network with a more accurate learning signal during training, avoiding the pitfalls of the TSDF interpolation used in previous work. Second, we introduce a depth guidance strategy that uses multi-view depth estimates to enhance the scene representation and recover more accurate surfaces. Finally, we develop a novel architecture for the final layers of the network, conditioning the output TSDF prediction on high-resolution image features in addition to coarse voxel features, which enables sharper reconstruction of fine details. Our method produces smooth and highly accurate reconstructions, showing significant improvements across multiple depth and 3D reconstruction metrics.
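The resolution-agnostic supervision described above can be illustrated with a short sketch: rather than trilinearly interpolating a precomputed ground-truth TSDF grid, truncated signed distances are evaluated directly at arbitrary continuous query points from the ground-truth mesh. This is a minimal sketch, not the paper's exact recipe: the truncation distance TRUNC, the sampling scheme, and the model interface are illustrative assumptions, and trimesh is used only as a convenient way to compute signed distances.

    import numpy as np
    import torch
    import trimesh

    TRUNC = 0.10  # truncation distance in meters (illustrative hyperparameter)

    def sample_tsdf_supervision(gt_mesh: trimesh.Trimesh, n_points: int = 4096):
        # Sample points on the ground-truth surface, then perturb them so the
        # queries cover the truncation band around the surface.
        surf_pts, _ = trimesh.sample.sample_surface(gt_mesh, n_points)
        queries = surf_pts + np.random.uniform(-TRUNC, TRUNC, size=surf_pts.shape)
        # Signed distance evaluated directly at the continuous query points
        # (trimesh convention: positive inside the mesh), so no fixed-resolution
        # grid, and no interpolation error, enters the supervision signal.
        sdf = trimesh.proximity.signed_distance(gt_mesh, queries)
        # Flip to the usual TSDF convention (positive in free space), truncate,
        # and normalize to [-1, 1].
        tsdf = np.clip(-sdf, -TRUNC, TRUNC) / TRUNC
        return torch.from_numpy(queries).float(), torch.from_numpy(tsdf).float()

    def tsdf_loss(model, queries, gt_tsdf):
        # `model` is assumed to predict a TSDF value at arbitrary 3D points.
        return torch.nn.functional.l1_loss(model(queries), gt_tsdf)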
Abstract: Dense 3D reconstruction from RGB images traditionally assumes static camera pose estimates. This assumption has endured even as recent works have increasingly focused on real-time methods for mobile devices. However, the assumption of one pose per image does not hold for online execution: poses from real-time SLAM are dynamic and may be updated following events such as bundle adjustment and loop closure. This has been addressed in the RGB-D setting by de-integrating past views and re-integrating them with updated poses, but it remains largely untreated in the RGB-only setting. We formalize this problem to define the new task of online reconstruction from dynamically posed images. To support further research, we introduce a dataset called LivePose containing the dynamic poses from a SLAM system running on ScanNet. We select three recent reconstruction systems and apply a framework based on de-integration to adapt each one to the dynamic-pose setting. In addition, we propose a novel, non-linear de-integration module that learns to remove stale scene content. We show that responding to pose updates is critical for high-quality reconstruction, and that our de-integration framework is an effective solution.
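For context, the de-integration this abstract builds on is, in its classical RGB-D form, an exact inverse of standard weighted-average TSDF fusion: a view fused under a stale pose can be subtracted back out and re-fused under the updated pose. The sketch below shows that linear baseline (the paper's proposed module is a learned, non-linear replacement for it); variable names and the epsilon guard are illustrative.

    import numpy as np

    def integrate(tsdf, weight, obs_tsdf, obs_weight):
        # Standard running weighted average used in RGB-D fusion:
        # D' = (W*D + w*d) / (W + w),  W' = W + w
        new_w = weight + obs_weight
        new_tsdf = (tsdf * weight + obs_tsdf * obs_weight) / np.maximum(new_w, 1e-8)
        return new_tsdf, new_w

    def deintegrate(tsdf, weight, obs_tsdf, obs_weight):
        # Exact inverse of integrate() for the same observation:
        # D' = (W*D - w*d) / (W - w),  W' = W - w
        new_w = weight - obs_weight
        new_tsdf = (tsdf * weight - obs_tsdf * obs_weight) / np.maximum(new_w, 1e-8)
        return new_tsdf, new_w

    # On a pose update for view k: recompute its TSDF observation under the
    # old pose, deintegrate() it, then integrate() it under the new pose.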
Abstract: We introduce Replica, a dataset of 18 highly photo-realistic 3D indoor scene reconstructions at room and building scale. Each scene consists of a dense mesh, high-resolution high-dynamic-range (HDR) textures, per-primitive semantic class and instance information, and planar mirror and glass reflectors. The goal of Replica is to enable machine learning (ML) research that relies on visually, geometrically, and semantically realistic generative models of the world: for instance, egocentric computer vision, semantic segmentation in 2D and 3D, geometric inference, and the development of embodied agents (virtual robots) performing navigation, instruction following, and question answering. Due to the high level of realism of Replica's renderings, there is hope that ML systems trained on Replica may transfer directly to real-world image and video data. Together with the data, we are releasing a minimal C++ SDK as a starting point for working with the Replica dataset. In addition, Replica is "Habitat-compatible", i.e. it can be used natively with AI Habitat for training and testing embodied agents.
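As a usage note, "Habitat-compatible" means each Replica scene can be loaded directly by the habitat-sim simulator. The following is a minimal sketch assuming habitat-sim's Python API and an illustrative on-disk layout for the Replica assets; the scene path and sensor settings are assumptions, not part of the Replica release.

    import habitat_sim

    # Illustrative path; Replica ships a Habitat-compatible mesh per scene.
    SCENE = "Replica/room_0/habitat/mesh_semantic.ply"

    # One RGB camera on the agent.
    rgb = habitat_sim.CameraSensorSpec()
    rgb.uuid = "color"
    rgb.sensor_type = habitat_sim.SensorType.COLOR
    rgb.resolution = [480, 640]

    agent_cfg = habitat_sim.agent.AgentConfiguration()
    agent_cfg.sensor_specifications = [rgb]

    sim_cfg = habitat_sim.SimulatorConfiguration()
    sim_cfg.scene_id = SCENE

    sim = habitat_sim.Simulator(habitat_sim.Configuration(sim_cfg, [agent_cfg]))
    obs = sim.get_sensor_observations()  # obs["color"] holds a rendered RGB frame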