N. Sai Shankar

AutoLay: Benchmarking amodal layout estimation for autonomous driving

Aug 20, 2021

Kaustubh Mani, N. Sai Shankar, Krishna Murthy Jatavallabhula, K. Madhava Krishna

Abstract: Given an image or a video captured from a monocular camera, amodal layout estimation is the task of predicting semantics and occupancy in bird's-eye view. The term amodal implies that we also reason about entities in the scene that are occluded or truncated in image space. While several recent efforts have tackled this problem, there is a lack of standardization in task specification, datasets, and evaluation protocols. We address these gaps with AutoLay, a dataset and benchmark for amodal layout estimation from monocular images. AutoLay encompasses driving imagery from two popular datasets: KITTI and Argoverse. In addition to fine-grained attributes such as lanes, sidewalks, and vehicles, we also provide semantically annotated 3D point clouds. We implement several baselines and bleeding-edge approaches, and release our data and code.

* Published in the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

RackLay: Multi-Layer Layout Estimation for Warehouse Racks

Mar 17, 2021

Meher Shashwat Nigam, Avinash Prabhu, Anurag Sahu, Puru Gupta, Tanvi Karandikar, N. Sai Shankar, Ravi Kiran Sarvadevabhatla, K. Madhava Krishna

Abstract: Given a monocular colour image of a warehouse rack, we aim to predict the bird's-eye view layout for each shelf in the rack, which we term multi-layer layout prediction. To this end, we present RackLay, a deep neural network for real-time shelf layout estimation from a single image. Unlike previous layout estimation methods, which provide a single layout for the dominant ground plane alone, RackLay estimates the top-view and front-view layout for each shelf in the considered rack populated with objects. RackLay's architecture and its variants are versatile and estimate accurate layouts for diverse scenes characterized by a varying number of visible shelves in an image, a large range in shelf occupancy factor, and varied background clutter. Given the extreme paucity of datasets in this space and the difficulty involved in acquiring real data from warehouses, we additionally release a flexible synthetic dataset generation pipeline, WareSynth, which allows users to control the generation process and tailor the dataset to the application at hand. The ablations across architectural variants and the comparison with strong prior baselines vindicate the efficacy of RackLay as an apt architecture for the novel problem of multi-layered layout estimation. We also show that fusing the top-view and front-view enables 3D reasoning applications such as metric free space estimation for the considered rack.

* Visit our project repository at https://github.com/Avinash2468/RackLay
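The abstract states that fusing the top-view and front-view layouts enables metric free space estimation, without saying how the fusion is done. The sketch below is one possible reading and is not taken from the RackLay code: it assumes both layouts for a shelf are binary grids sharing the same width axis, and treats a voxel as occupied only when both views mark it. The function name `free_space_volume`, the grid orientation, and the `cell_size` parameter are all illustrative assumptions.

```python
from typing import List

Grid = List[List[int]]  # binary occupancy, 1 = occupied


def free_space_volume(top: Grid, front: Grid, cell_size: float) -> float:
    """Estimate the free volume of one shelf from two orthogonal layouts.

    Assumptions (illustrative, not from the paper):
      * top[z][x]   is the top-view layout   (depth  x width)
      * front[y][x] is the front-view layout (height x width)
      * a voxel (x, y, z) is occupied only if both views mark it
      * every cell is a cube with side `cell_size` in metres
    """
    depth, width = len(top), len(top[0])
    height = len(front)

    occupied = 0
    for x in range(width):
        for z in range(depth):
            for y in range(height):
                if top[z][x] and front[y][x]:
                    occupied += 1

    total = width * height * depth
    return (total - occupied) * cell_size ** 3


# A 2 x 2 x 2 shelf with 0.5 m cells and one box in a corner.
top_view = [[1, 0],
            [0, 0]]
front_view = [[1, 0],
              [0, 0]]
free_m3 = free_space_volume(top_view, front_view, cell_size=0.5)
```

Intersecting two silhouettes can only overestimate the occupied region, so under these assumptions the result is a conservative lower bound on free space.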

MonoLayout: Amodal scene layout from a single image

Feb 19, 2020

Kaustubh Mani, Swapnil Daga, Shubhika Garg, N. Sai Shankar, Krishna Murthy Jatavallabhula, K. Madhava Krishna

Abstract: In this paper, we address the novel, highly challenging problem of estimating the layout of a complex urban driving scenario. Given a single color image captured from a driving platform, we aim to predict the bird's-eye view layout of the road and other traffic participants. The estimated layout should reason beyond what is visible in the image, and compensate for the loss of 3D information due to projection. We dub this problem amodal scene layout estimation, which involves "hallucinating" scene layout even for parts of the world that are occluded in the image. To this end, we present MonoLayout, a deep neural network for real-time amodal scene layout estimation from a single image. We represent scene layout as a multi-channel semantic occupancy grid, and leverage adversarial feature learning to hallucinate plausible completions for occluded image parts. Due to the lack of fair baseline methods, we extend several state-of-the-art approaches for road-layout estimation and vehicle occupancy estimation in bird's-eye view to the amodal setup for rigorous evaluation. By leveraging temporal sensor fusion to generate training labels, we significantly outperform current art over a number of datasets. On the KITTI and Argoverse datasets, we outperform all baselines by a significant margin. We also make all our annotations and code publicly available. A video abstract of this paper is available at https://www.youtube.com/watch?v=HcroGyo6yRQ.

* To be presented at WACV 2020. Video: https://www.youtube.com/watch?v=HcroGyo6yRQ. Project page: https://hbutsuak95.github.io/monolayout
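The abstract describes scene layout as a multi-channel semantic occupancy grid. As a rough illustration of that representation only (not the MonoLayout implementation), the sketch below stores one binary bird's-eye-view grid per semantic class and compares a prediction with a target using per-channel intersection-over-union. The channel names, grid size, helper names, and the choice of IoU as the comparison are assumptions made for this example.

```python
from typing import Dict, List

Grid = List[List[int]]  # one binary channel, indexed [row][col]


def make_grid(rows: int, cols: int) -> Grid:
    """Create an empty (all-free) bird's-eye-view channel."""
    return [[0] * cols for _ in range(rows)]


def mark_rectangle(grid: Grid, top: int, left: int, bottom: int, right: int) -> None:
    """Mark cells in rows [top, bottom) and columns [left, right) as occupied."""
    for r in range(top, bottom):
        for c in range(left, right):
            grid[r][c] = 1


def iou(pred: Grid, target: Grid) -> float:
    """Intersection-over-union of two binary grids of equal shape."""
    inter = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    union = sum(p | t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    return inter / union if union else 1.0


# Hypothetical two-channel layouts on a 4 x 4 grid.
target: Dict[str, Grid] = {"road": make_grid(4, 4), "vehicle": make_grid(4, 4)}
pred: Dict[str, Grid] = {"road": make_grid(4, 4), "vehicle": make_grid(4, 4)}

mark_rectangle(target["road"], 0, 1, 4, 3)     # true road: columns 1-2
mark_rectangle(pred["road"], 0, 1, 4, 4)       # predicted road: columns 1-3
mark_rectangle(target["vehicle"], 1, 1, 2, 2)  # one vehicle cell
mark_rectangle(pred["vehicle"], 1, 1, 2, 2)    # predicted exactly

scores = {name: iou(pred[name], target[name]) for name in target}
```

Keeping each class in its own channel lets cells overlap across classes, for example a vehicle standing on the road, which a single-label map could not express.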
