Abstract: Driven by extensive recent research on deep learning and its application in construction, crack detection has evolved rapidly from coarse detection at the image level and patch level to fine-grained detection at the pixel level, which better suits the nature of this field. Although many existing studies use off-the-shelf deep learning models or enhanced variants of them, these models are not always effective or efficient in real-world applications. To bridge this gap, we propose HrSegNet, a High-resolution model with Semantic guidance designed specifically for real-time crack segmentation. Our model maintains high resolution throughout the entire process, rather than recovering high-resolution features from low-resolution ones, thereby maximizing the preservation of crack details. Moreover, to enrich the context information, we use low-resolution semantic features to guide the reconstruction of high-resolution features. To keep the algorithm efficient, we design a simple yet effective method that controls the computation cost of the entire model by controlling the capacity of the high-resolution channels, which also makes the model highly scalable. Extensive quantitative and qualitative evaluations demonstrate that HrSegNet has strong crack segmentation capabilities, and that both maintaining high resolution and semantic guidance are crucial to the final prediction. Compared to state-of-the-art segmentation models, HrSegNet achieves the best trade-off between efficiency and effectiveness. Specifically, on the crack dataset CrackSeg9k, our fastest model HrSegNet-B16 achieves 78.43% mIoU at 182 FPS, while our most accurate model HrSegNet-B48 achieves 80.32% mIoU at an inference speed of 140.3 FPS.
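
For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how mean intersection-over-union can be computed for pixel-level crack segmentation. It assumes two classes (0 for background, 1 for crack) and flattened label maps. The function name `mean_iou` and the toy data are illustrative only, and the exact evaluation protocol behind the reported numbers (for example, whether counts are accumulated over the whole dataset or averaged per image) is not stated here.

```python
def mean_iou(pred, target, num_classes=2):
    """Mean IoU over classes for two flat sequences of integer labels.

    Assumption: class 0 is background and class 1 is crack.
    Classes absent from both pred and target are skipped.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth equal class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth equals class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels with one false-positive crack pixel (index 3).
pred = [0, 0, 1, 1, 0, 1]
target = [0, 0, 1, 0, 0, 1]
score = mean_iou(pred, target)
print(f"mIoU = {score:.4f}")
```

In this toy case the background IoU is 3/4 and the crack IoU is 2/3, so the mean is roughly 0.708. Because cracks occupy few pixels, the crack-class IoU is usually the term that separates models.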