Abstract:Digitalization in the construction industry has become essential, enabling centralized, easy access to all relevant information of a building. Automated systems can facilitate the timely and resource-efficient documentation of changes, which is crucial for key processes such as progress tracking and quality control. This paper presents a method for image-based automated drywall analysis enabling construction progress and quality assessment through on-site camera systems. Our proposed solution integrates a deep learning-based instance segmentation model to detect and classify various drywall elements with an analysis module to cluster individual wall segments, estimate camera perspective distortions, and apply the corresponding corrections. This system extracts valuable information from images, enabling more accurate progress tracking and quality assessment on construction sites. Our main contributions include a fully automated pipeline for drywall analysis, improving instance segmentation accuracy through architecture modifications and targeted data augmentation, and a novel algorithm to extract important information from the segmentation results. Our modified model, enhanced with data augmentation, achieves significantly higher accuracy compared to other architectures, offering more detailed and precise information than existing approaches. Combined with the proposed drywall analysis steps, it enables the reliable automation of construction progress and quality assessment.
Abstract:We present Decomposer, a semi-supervised reconstruction model that decomposes distorted image sequences into their fundamental building blocks - the original image and the applied augmentations, i.e., shadow, light, and occlusions. To solve this problem, we use the SIDAR dataset that provides a large number of distorted image sequences: each sequence contains images with shadows, lighting, and occlusions applied to an undistorted version. Each distortion changes the original signal in different ways, e.g., additive or multiplicative noise. We propose a transformer-based model to explicitly learn this decomposition. The sequential model uses 3D Swin-Transformers for spatio-temporal encoding and 3D U-Nets as prediction heads for individual parts of the decomposition. We demonstrate that by separately pre-training our model on weakly supervised pseudo labels, we can steer our model to optimize for our ambiguous problem definition and learn to differentiate between the different image distortions.
Abstract:Digitalization of existing buildings and the creation of 3D BIM models for them has become crucial for many tasks. Of particular importance are floor plans, which contain information about building layouts and are vital for processes such as construction, maintenance or refurbishing. However, this data is not always available in digital form, especially for older buildings constructed before CAD tools were widely available, or lacks semantic information. The digitalization of such information usually requires manual work of an expert that must reconstruct the layouts by hand, which is a cumbersome and error-prone process. In this paper, we present a pipeline for reconstruction of vectorized 3D models from scanned 2D plans, aiming at increasing the efficiency of this process. The method presented achieves state-of-the-art results in the public dataset CubiCasa5k, and shows good generalization to different types of plans. Our vectorization approach is particularly effective, outperforming previous methods.