Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia

Oct 31, 2024

Ibne Hassan, Aman Mujahid, Abdullah Al Hasib, Andalib Rahman Shagoto, Joyanta Jyoti Mondal, Meem Arafat Manab, Jannatun Noor

Figure 1 for Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia

Figure 2 for Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia

Figure 3 for Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia

Figure 4 for Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia

Share this with someone who'll enjoy it:

Abstract:Countries in South Asia experience many catastrophic flooding events regularly. Through image classification, it is possible to expedite search and rescue initiatives by classifying flood zones, including houses and humans. We create a new dataset collecting aerial imagery of flooding events across South Asian countries. For the classification, we propose a fine-tuned Compact Convolutional Transformer (CCT) based approach and some other cutting-edge transformer-based and Convolutional Neural Network-based architectures (CNN). We also implement the YOLOv8 object detection model and detect houses and humans within the imagery of our proposed dataset, and then compare the performance with our classification-based approach. Since the countries in South Asia have similar topography, housing structure, the color of flood water, and vegetation, this work can be more applicable to such a region as opposed to the rest of the world. The images are divided evenly into four classes: 'flood', 'flood with domicile', 'flood with humans', and 'no flood'. After experimenting with our proposed dataset on our fine-tuned CCT model, which has a comparatively lower number of weight parameters than many other transformer-based architectures designed for computer vision, it exhibits an accuracy and macro average precision of 98.62% and 98.50%. The other transformer-based architectures that we implement are the Vision Transformer (ViT), Swin Transformer, and External Attention Transformer (EANet), which give an accuracy of 88.66%, 84.74%, and 66.56% respectively. We also implement DCECNN (Deep Custom Ensembled Convolutional Neural Network), which is a custom ensemble model that we create by combining MobileNet, InceptionV3, and EfficientNetB0, and we obtain an accuracy of 98.78%. The architectures we implement are fine-tuned to achieve optimal performance on our dataset.

View paper on

Share this with someone who'll enjoy it:

Title:Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia

Paper and Code