Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Saghar Irandoust

JigsawPlan: Room Layout Jigsaw Puzzle Extreme Structure from Motion using Diffusion Models

Nov 24, 2022

Sepidehsadat Hosseini, Mohammad Amin Shabani, Saghar Irandoust, Yasutaka Furukawa

Figure 1 for JigsawPlan: Room Layout Jigsaw Puzzle Extreme Structure from Motion using Diffusion Models

Figure 2 for JigsawPlan: Room Layout Jigsaw Puzzle Extreme Structure from Motion using Diffusion Models

Figure 3 for JigsawPlan: Room Layout Jigsaw Puzzle Extreme Structure from Motion using Diffusion Models

Figure 4 for JigsawPlan: Room Layout Jigsaw Puzzle Extreme Structure from Motion using Diffusion Models

Abstract:This paper presents a novel approach to the Extreme Structure from Motion (E-SfM) problem, which takes a set of room layouts as polygonal curves in the top-down view, and aligns the room layout pieces by estimating their 2D translations and rotations, akin to solving the jigsaw puzzle of room layouts. The biggest discovery and surprise of the paper is that the simple use of a Diffusion Model solves this challenging registration problem as a conditional generation process. The paper presents a new dataset of room layouts and floorplans for 98,780 houses. The qualitative and quantitative evaluations demonstrate that the proposed approach outperforms the competing methods by significant margins.

Via

Access Paper or Ask Questions

Training a Vision Transformer from scratch in less than 24 hours with 1 GPU

Nov 09, 2022

Saghar Irandoust, Thibaut Durand, Yunduz Rakhmangulova, Wenjie Zi, Hossein Hajimirsadeghi

Abstract:Transformers have become central to recent advances in computer vision. However, training a vision Transformer (ViT) model from scratch can be resource intensive and time consuming. In this paper, we aim to explore approaches to reduce the training costs of ViT models. We introduce some algorithmic improvements to enable training a ViT model from scratch with limited hardware (1 GPU) and time (24 hours) resources. First, we propose an efficient approach to add locality to the ViT architecture. Second, we develop a new image size curriculum learning strategy, which allows to reduce the number of patches extracted from each image at the beginning of the training. Finally, we propose a new variant of the popular ImageNet1k benchmark by adding hardware and time constraints. We evaluate our contributions on this benchmark, and show they can significantly improve performances given the proposed training budget. We will share the code in https://github.com/BorealisAI/efficient-vit-training.

* 7 pages, 2 figures, 1 table, published in "Has it Trained Yet? Workshop at the Conference on Neural Information Processing Systems (NeurIPS 2022)"

Via

Access Paper or Ask Questions

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

May 18, 2021

Sachini Herath, Saghar Irandoust, Bowen Chen, Yiming Qian, Pyojin Kim, Yasutaka Furukawa

Figure 1 for Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Figure 2 for Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Figure 3 for Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Figure 4 for Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

Abstract:The paper proposes a multi-modal sensor fusion algorithm that fuses WiFi, IMU, and floorplan information to infer an accurate and dense location history in indoor environments. The algorithm uses 1) an inertial navigation algorithm to estimate a relative motion trajectory from IMU sensor data; 2) a WiFi-based localization API in industry to obtain positional constraints and geo-localize the trajectory; and 3) a convolutional neural network to refine the location history to be consistent with the floorplan. We have developed a data acquisition app to build a new dataset with WiFi, IMU, and floorplan data with ground-truth positions at 4 university buildings and 3 shopping malls. Our qualitative and quantitative evaluations demonstrate that the proposed system is able to produce twice as accurate and a few orders of magnitude denser location history than the current standard, while requiring minimal additional energy consumption. We will publicly share our code, data and models.

* ICRA 2021
* To be published in ICRA 2021. Code and data: https://github.com/Sachini/Fusion-DHL

Via

Access Paper or Ask Questions