Leonardo Taccari

RendBEV: Semantic Novel View Synthesis for Self-Supervised Bird's Eye View Segmentation

Feb 20, 2025

Henrique Piñeiro Monteagudo, Leonardo Taccari, Aurel Pjetri, Francesco Sambo, Samuele Salti

Abstract: Bird's Eye View (BEV) semantic maps have recently garnered a lot of attention as a useful representation of the environment to tackle assisted and autonomous driving tasks. However, most of the existing work focuses on the fully supervised setting, training networks on large annotated datasets. In this work, we present RendBEV, a new method for the self-supervised training of BEV semantic segmentation networks, leveraging differentiable volumetric rendering to receive supervision from semantic perspective views computed by a 2D semantic segmentation model. Our method enables zero-shot BEV semantic segmentation, and already delivers competitive results in this challenging setting. When used as pretraining to then fine-tune on labeled BEV ground-truth, our method significantly boosts performance in low-annotation regimes, and sets a new state of the art when fine-tuning on all available labels.

* Accepted at WACV 2025

A New Dataset for Monocular Depth Estimation Under Viewpoint Shifts

Sep 27, 2024

Aurel Pjetri, Stefano Caprasecca, Leonardo Taccari, Matteo Simoncini, Henrique Piñeiro Monteagudo, Walter Wallace, Douglas Coimbra de Andrade, Francesco Sambo, Andrew David Bagdanov

Abstract: Monocular depth estimation is a critical task for autonomous driving and many other computer vision applications. While significant progress has been made in this field, the effects of viewpoint shifts on depth estimation models remain largely underexplored. This paper introduces a novel dataset and evaluation methodology to quantify the impact of different camera positions and orientations on monocular depth estimation performance. We propose a ground truth strategy based on homography estimation and object detection, eliminating the need for expensive lidar sensors. We collect a diverse dataset of road scenes from multiple viewpoints and use it to assess the robustness of a modern depth estimation model to geometric shifts. After assessing the validity of our strategy on a public dataset, we provide valuable insights into the limitations of current models and highlight the importance of considering viewpoint variations in real-world applications.

* 17 pages, 5 figures. Accepted at ECCV 2024 2nd Workshop on Vision-Centric Autonomous Driving (VCAD)
