Shunyi Zheng

AMMUNet: Multi-Scale Attention Map Merging for Remote Sensing Image Segmentation

Apr 20, 2024

Yang Yang, Shunyi Zheng

Abstract: The advancement of deep learning has driven notable progress in remote sensing semantic segmentation. Attention mechanisms enable global modeling and the use of contextual information, but they incur high computational costs and typically rely on window-based operations that weaken the capture of long-range dependencies, which limits their effectiveness for remote sensing image processing. In this letter, we propose AMMUNet, a UNet-based framework that employs multi-scale attention map merging, comprising two key innovations: the granular multi-head self-attention (GMSA) module and the attention map merging mechanism (AMMM). GMSA acquires global information while substantially reducing computational cost compared with the global multi-head self-attention mechanism. This is accomplished by using dimension correspondence to align granularity and by reducing the number of relative position bias parameters. The proposed AMMM combines multi-scale attention maps into a unified representation using a fixed mask template, enabling the modeling of a global attention mechanism. Experimental evaluations show mean intersection over union (mIoU) scores of 75.48% on the Vaihingen dataset and 77.90% on the Potsdam dataset. Codes are available at https://github.com/interpretty/AMMUNet.
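
The mIoU metric reported in this abstract can be illustrated with a minimal sketch. It assumes flat lists of integer class labels and averages only over classes that appear in the prediction or the ground truth; the function name `mean_iou` and the toy labels are hypothetical, and the paper's exact evaluation protocol (for example, which classes are counted, or whether a confusion matrix is accumulated over the whole dataset) is not stated here.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for flat lists of integer class labels."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        intersection = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(score)
```

Per-class IoU values here are 1/3, 2/3, and 1/2, so the mean is 0.5.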

Few-shot semantic segmentation via mask aggregation

Feb 15, 2022

Wei Ao, Shunyi Zheng, Yan Meng

Abstract: Few-shot semantic segmentation aims to recognize novel classes from only a few labelled examples. This challenging task requires mining the relevant relationships between the query image and the support images. Previous works have typically regarded it as a pixel-wise classification problem, and various models have therefore been designed to explore the correlation of pixels between the query image and the support images. However, they focus only on pixel-wise correspondence and ignore the overall correlation of objects. In this paper, we introduce a mask-based classification method for addressing this problem. The mask aggregation network (MANet), which is a simple mask classification model, is proposed to simultaneously generate a fixed number of masks and their probabilities of being targets. Then, the final segmentation result is obtained by aggregating all the masks according to their locations. Experiments on both the PASCAL-5^i and COCO-20^i datasets show that our method performs comparably to the state-of-the-art pixel-based methods. This competitive performance demonstrates the potential of mask classification as an alternative baseline method in few-shot semantic segmentation. Our source code will be made available at https://github.com/TinyAway/MANet.

Feature Pyramid Network with Multi-Head Attention for Semantic Segmentation of Fine-Resolution Remotely Sensed Images

Feb 19, 2021

Rui Li, Shunyi Zheng, Chenxi Duan

Abstract: Semantic segmentation from fine-resolution remotely sensed images is an urgent issue in satellite imagery processing. Due to the complicated environment, automatic categorization and segmentation is challenging, especially for images with a fine resolution. Solving it can help to surmount a wide range of obstacles in urban planning, environmental protection, and natural landscape monitoring, which paves the way for complete scene understanding. However, the frequently used encoder-decoder structure is unable to effectively combine the extracted spatial and contextual features. Therefore, in this paper, we introduce the Feature Pyramid Network (FPN) to bridge the gap between the low-level and high-level features. Moreover, we enhance the contextual information with the elaborate Multi-Head Attention module and propose the Feature Pyramid Network with Multi-Head Attention (FPN-MHA) for semantic segmentation of fine-resolution remotely sensed images. Extensive experiments conducted on the ISPRS Potsdam and Vaihingen datasets demonstrate the effectiveness of our FPN-MHA. Code is available at https://github.com/lironui/FPN-MHA.

Multi-stage Attention ResU-Net for Semantic Segmentation of Fine-Resolution Remote Sensing Images

Dec 01, 2020

Rui Li, Shunyi Zheng, Chenxi Duan, Jianlin Su, Ce Zhang

Abstract: The attention mechanism can refine the extracted feature maps and boost the classification performance of the deep network, and it has become an essential technique in computer vision and natural language processing. However, the memory and computational costs of the dot-product attention mechanism increase quadratically with the spatio-temporal size of the input. Such growth considerably hinders the use of attention mechanisms in application scenarios with large-scale inputs. In this Letter, we propose a Linear Attention Mechanism (LAM) to address this issue, which approximates dot-product attention at a much lower computational cost. Such a design makes the incorporation of attention mechanisms into deep networks much more flexible and versatile. Based on the proposed LAM, we refactor the skip connections in the raw U-Net and design a Multi-stage Attention ResU-Net (MAResU-Net) for semantic segmentation from fine-resolution remote sensing images. Experiments conducted on the Vaihingen dataset demonstrate the effectiveness and efficiency of our MAResU-Net. Open-source code is available at https://github.com/lironui/Multistage-Attention-ResU-Net.
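
The quadratic-versus-linear cost contrast described in this abstract can be sketched with a generic kernelized attention. This is not the paper's LAM, whose approximation is not given in the abstract: the feature map `phi` (ELU(x) + 1) is an assumed placeholder, and all function names are hypothetical. The sketch only shows that, once similarity is written as a product of feature maps, reordering the matrix products gives the same output without ever forming the N x N attention matrix.

```python
import math


def phi(x):
    # Assumed positive feature map, ELU(x) + 1; a placeholder, not the paper's LAM.
    return x + 1.0 if x > 0 else math.exp(x)


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def quadratic_attention(q, k, v):
    """Builds the full N x N similarity matrix, then normalizes each row."""
    fq = [[phi(x) for x in row] for row in q]
    fk = [[phi(x) for x in row] for row in k]
    sim = matmul(fq, transpose(fk))                       # N x N
    weights = [[s / sum(row) for s in row] for row in sim]
    return matmul(weights, v)                             # N x d_v


def linear_attention(q, k, v):
    """Same result computed as phi(Q) (phi(K)^T V); no N x N matrix is formed."""
    fq = [[phi(x) for x in row] for row in q]
    fk = [[phi(x) for x in row] for row in k]
    kv = matmul(transpose(fk), v)                         # d x d_v
    k_sum = [sum(col) for col in zip(*fk)]                # length d
    out = []
    for row in fq:
        norm = sum(a * b for a, b in zip(row, k_sum))     # row normalizer
        numerator = matmul([row], kv)[0]
        out.append([x / norm for x in numerator])
    return out


# Toy inputs: N = 3 positions, d = d_v = 2 (hypothetical values).
q = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
k = [[1.0, 0.0], [-0.5, 0.7], [0.2, -1.2]]
v = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
slow = quadratic_attention(q, k, v)
fast = linear_attention(q, k, v)
```

With N positions and feature dimension d, the first ordering costs on the order of N^2 d operations and the second on the order of N d^2, which is why the reordering helps when N (the number of pixels) is much larger than d.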

* arXiv admin note: substantial text overlap with arXiv:2007.14902, arXiv:2009.02130

Multi-Attention-Network for Semantic Segmentation of High-Resolution Remote Sensing Images

Sep 03, 2020

Rui Li, Shunyi Zheng, Chenxi Duan, Jianlin Su

Abstract: Semantic segmentation of remote sensing images plays an important role in land resource management, yield estimation, and economic assessment. Even though the semantic segmentation of remote sensing images has been prominently improved by convolutional neural networks, standard models still have several limitations. First, for encoder-decoder architectures like U-Net, the utilization of multi-scale features causes overuse of information, where similar low-level features are exploited multiple times at multiple scales. Second, long-range dependencies of feature maps are not sufficiently explored, so the feature representations associated with each semantic class are not optimal. Third, although the dot-product attention mechanism has been introduced and harnessed widely in semantic segmentation to model long-range dependencies, the high time and space complexities of attention impede its use in application scenarios with large inputs. In this paper, we propose a Multi-Attention-Network (MANet) to remedy these drawbacks, which extracts contextual dependencies through multiple efficient attention mechanisms. A novel attention mechanism named kernel attention with linear complexity is proposed to alleviate the high computational demand of attention. Based on kernel attention and channel attention, we integrate local feature maps extracted by ResNeXt-101 with their corresponding global dependencies, and adaptively signalize interdependent channel maps. Experiments conducted on two remote sensing image datasets captured by different satellites demonstrate that our MANet outperforms DeepLab V3+, PSPNet, FastFCN, and other baseline algorithms.

* arXiv admin note: substantial text overlap with arXiv:2007.14902

Linear Attention Mechanism: An Efficient Attention for Semantic Segmentation

Aug 20, 2020

Rui Li, Jianlin Su, Chenxi Duan, Shunyi Zheng

Abstract: In this paper, to remedy the high memory and computational costs of dot-product attention, we propose a Linear Attention Mechanism which approximates dot-product attention with much lower memory and computational costs. The efficient design makes the incorporation of attention mechanisms into neural networks more flexible and versatile. Experiments conducted on semantic segmentation demonstrate the effectiveness of the linear attention mechanism. Code is available at https://github.com/lironui/Linear-Attention-Mechanism.

Land Cover Classification from Remote Sensing Images Based on Multi-Scale Fully Convolutional Network

Aug 01, 2020

Rui Li, Shunyi Zheng, Chenxi Duan

Abstract: In this paper, a Multi-Scale Fully Convolutional Network (MSFCN) with multi-scale convolutional kernels is proposed to exploit discriminative representations from two-dimensional (2D) satellite images.

MACU-Net Semantic Segmentation from High-Resolution Remote Sensing Images

Jul 26, 2020

Rui Li, Chenxi Duan, Shunyi Zheng

Abstract: Semantic segmentation of remote sensing images plays an important role in land resource management, yield estimation, and economic assessment. U-Net is a sophisticated encoder-decoder architecture which has been frequently used in medical image segmentation and has attained prominent performance. The asymmetric convolution block can enhance square convolution kernels using asymmetric convolutions. In this paper, based on U-Net and the asymmetric convolution block, we incorporate multi-scale features generated by different layers of U-Net and design a multi-scale skip connected architecture, MACU-Net, for semantic segmentation using high-resolution remote sensing images. Our design has the following advantages: (1) the multi-scale skip connections combine and realign semantic features contained in both low-level and high-level feature maps with different scales; (2) the asymmetric convolution block strengthens the representational capacity of a standard convolution layer. Experiments conducted on two remote sensing image datasets captured by separate satellites demonstrate that our MACU-Net outperforms U-Net, SegNet, DeepLab V3+, and other baseline algorithms.
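
One way to read the statement that asymmetric convolutions enhance a square kernel is through the additivity of convolution, sketched below. The layout is an assumption, not taken from the abstract: a 3x3 branch, a 1x3 branch, and a 3x1 branch are applied to the same input and summed, and batch normalization and multiple channels are omitted. Under that assumption, the summed branch outputs equal a single 3x3 convolution whose central row and column have been reinforced by the asymmetric kernels. All names and values are hypothetical.

```python
def conv2d_same(image, kernel):
    """Zero-padded 'same' cross-correlation of 2-D lists (odd kernel sizes)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:  # outside the image counts as zero
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def fuse_kernels(square, horizontal, vertical):
    """Add the 1x3 and 3x1 kernels onto the central row and column of the 3x3 kernel."""
    fused = [row[:] for row in square]
    for b in range(3):
        fused[1][b] += horizontal[0][b]
    for a in range(3):
        fused[a][1] += vertical[a][0]
    return fused


image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 1.0, 3.0, 2.0],
         [2.0, 1.0, 0.0, 1.0],
         [1.0, 0.0, 2.0, 3.0]]
square = [[0.1, 0.2, 0.3], [0.0, 0.5, -0.1], [0.2, -0.3, 0.4]]  # 3 x 3
horizontal = [[0.3, -0.2, 0.1]]                                  # 1 x 3
vertical = [[0.2], [0.4], [-0.5]]                                # 3 x 1

# Sum of the three branch outputs, element by element.
branches = [conv2d_same(image, k) for k in (square, horizontal, vertical)]
summed = [[sum(vals) for vals in zip(*rows)] for rows in zip(*branches)]

# Single convolution with the fused 3 x 3 kernel.
fused_out = conv2d_same(image, fuse_kernels(square, horizontal, vertical))
```

The two results agree up to floating-point rounding, which is why such a block can add capacity during training without adding cost to a fused 3x3 layer at inference, provided the assumed layout holds.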
