Fengting Yang

PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos

Jun 15, 2022

Yiming Xie, Matheus Gadelha, Fengting Yang, Xiaowei Zhou, Huaizu Jiang

Abstract: We present PlanarRecon, a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video. Unlike previous works that detect planes in 2D from a single image, PlanarRecon uses neural networks to incrementally detect planes in 3D from a volumetric representation of the scene, one video fragment (a set of key frames) at a time. A learning-based tracking and fusion module merges planes from previous fragments to form a coherent global plane reconstruction. This design allows PlanarRecon to integrate observations from multiple views within each fragment and temporal information across fragments, resulting in an accurate and coherent reconstruction of the scene abstraction with low-polygonal geometry. Experiments show that the proposed approach achieves state-of-the-art performance on the ScanNet dataset while running in real time.

* CVPR 2022. Project page: https://neu-vi.github.io/planarrecon/

Deep Depth from Focus with Differential Focus Volume

Dec 03, 2021

Fengting Yang, Xiaolei Huang, Zihan Zhou

Abstract: Depth-from-focus (DFF) is a technique that infers depth using the focus change of a camera. In this work, we propose a convolutional neural network (CNN) to find the best-focused pixels in a focal stack and infer depth from the focus estimation. The key innovation of the network is the novel deep differential focus volume (DFV). By computing the first-order derivative of the stacked features over different focal distances, DFV captures both the focus and context information for focus analysis. We also introduce a probability regression mechanism for focus estimation to handle sparsely sampled focal stacks and provide uncertainty estimation for the final prediction. Comprehensive experiments demonstrate that the proposed model achieves state-of-the-art performance on multiple datasets with good generalizability and fast speed.
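
The two ideas named in the abstract, a first-order difference along the focal dimension and a probability-based depth regression, can be illustrated for a single pixel with a minimal sketch. This is not the paper's network: the function names, the use of a softmax over per-slice focus scores, and the standard deviation as an uncertainty measure are all assumptions made for illustration.

```python
import math

def focal_difference(features):
    """First-order difference of one pixel's features along the focal axis.

    `features` holds one value per focal slice (hypothetical input)."""
    return [b - a for a, b in zip(features, features[1:])]

def softmax(scores):
    """Numerically stable softmax over per-slice focus scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expected_depth(scores, focal_distances):
    """Regress depth as the probability-weighted mean of the focal distances.

    Returns (depth, std); the standard deviation stands in for uncertainty
    here, which is an assumption rather than the paper's definition."""
    probs = softmax(scores)
    depth = sum(p * d for p, d in zip(probs, focal_distances))
    var = sum(p * (d - depth) ** 2 for p, d in zip(probs, focal_distances))
    return depth, math.sqrt(var)
```

Because the depth is a weighted mean rather than the index of the sharpest slice, it can fall between the sampled focal distances, which is one plausible way to cope with a sparsely sampled focal stack.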

* 17 pages

Superpixel Segmentation with Fully Convolutional Networks

Mar 29, 2020

Fengting Yang, Qian Sun, Hailin Jin, Zihan Zhou

Abstract: In computer vision, superpixels have been widely used as an effective way to reduce the number of image primitives for subsequent processing. However, only a few attempts have been made to incorporate them into deep neural networks. One main reason is that the standard convolution operation is defined on regular grids and becomes inefficient when applied to superpixels. Inspired by an initialization strategy commonly adopted by traditional superpixel algorithms, we present a novel method that employs a simple fully convolutional network to predict superpixels on a regular image grid. Experimental results on benchmark datasets show that our method achieves state-of-the-art superpixel segmentation performance while running at about 50 fps. Based on the predicted superpixels, we further develop a downsampling/upsampling scheme for deep networks with the goal of generating high-resolution outputs for dense prediction tasks. Specifically, we modify a popular network architecture for stereo matching to simultaneously predict superpixels and disparities. We show that improved disparity estimation accuracy can be obtained on public datasets.
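
The regular-grid initialization the abstract refers to can be sketched as follows. Traditional algorithms such as SLIC start by splitting the image into equal square cells; restricting each pixel to the 3x3 block of cells around its initial cell is an assumption made here for illustration, as are the function and parameter names.

```python
def grid_cell(row, col, cell_size):
    """Initial grid cell (cell_row, cell_col) containing pixel (row, col)."""
    return row // cell_size, col // cell_size

def candidate_cells(row, col, cell_size, grid_rows, grid_cols):
    """Grid cells a pixel may be assigned to: its own cell and the
    in-bounds neighbours in the surrounding 3x3 block (assumed scheme)."""
    r, c = grid_cell(row, col, cell_size)
    return [
        (rr, cc)
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
        if 0 <= rr < grid_rows and 0 <= cc < grid_cols
    ]
```

Under this scheme every pixel has at most nine candidate superpixels, so a network only needs to output a fixed-size set of association scores per pixel, which fits the regular layout that standard convolutions expect.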

* 16 pages, 15 figures, to be published in CVPR'20
