Abstract:Road infrastructure maintenance in developing countries faces unique challenges due to resource constraints and diverse environmental factors. This study addresses the critical need for efficient, accurate, and locally-relevant pavement distress detection methods in these regions. We present a novel deep learning approach combining YOLO (You Only Look Once) object detection models with a Convolutional Block Attention Module (CBAM) to simultaneously detect and classify multiple pavement distress types. The model demonstrates robust performance in detecting and classifying potholes, longitudinal cracks, alligator cracks, and raveling, with confidence scores ranging from 0.46 to 0.93. While some misclassifications occur in complex scenarios, these provide insights into unique challenges of pavement assessment in developing countries. Additionally, we developed a web-based application for real-time distress detection from images and videos. This research advances automated pavement distress detection and provides a tailored solution for developing countries, potentially improving road safety, optimizing maintenance strategies, and contributing to sustainable transportation infrastructure development.
Abstract:This research introduces the first multimodal approach for pavement condition assessment, providing both quantitative Pavement Condition Index (PCI) predictions and qualitative descriptions. We introduce PaveCap, a novel framework for automated pavement condition assessment. The framework consists of two main parts: a Single-Shot PCI Estimation Network and a Dense Captioning Network. The PCI Estimation Network uses YOLOv8 for object detection, the Segment Anything Model (SAM) for zero-shot segmentation, and a four-layer convolutional neural network to predict PCI. The Dense Captioning Network uses a YOLOv8 backbone, a Transformer encoder-decoder architecture, and a convolutional feed-forward module to generate detailed descriptions of pavement conditions. To train and evaluate these networks, we developed a pavement dataset with bounding box annotations, textual annotations, and PCI values. The results of our PCI Estimation Network showed a strong positive correlation (0.70) between predicted and actual PCIs, demonstrating its effectiveness in automating condition assessment. Also, the Dense Captioning Network produced accurate pavement condition descriptions, evidenced by high BLEU (0.7445), GLEU (0.5893), and METEOR (0.7252) scores. Additionally, the dense captioning model handled complex scenarios well, even correcting some errors in the ground truth data. The framework developed here can greatly improve infrastructure management and decision18 making in pavement maintenance.