Title: AgMTR: Agent Mining Transformer for Few-shot Segmentation in Remote Sensing

Sep 26, 2024

Hanbo Bi, Yingchao Feng, Yongqiang Mao, Jianning Pei, Wenhui Diao, Hongqi Wang, Xian Sun

Abstract: Few-shot Segmentation (FSS) aims to segment objects of interest in a query image given only a handful of labeled samples (i.e., support images). Previous schemes leverage the similarity between support-query pixel pairs to construct pixel-level semantic correlation. However, in remote sensing scenarios with extreme intra-class variations and cluttered backgrounds, such pixel-level correlations may produce numerous mismatches, resulting in semantic ambiguity between the query foreground (FG) and background (BG) pixels. To tackle this problem, we propose a novel Agent Mining Transformer (AgMTR), which adaptively mines a set of local-aware agents to construct agent-level semantic correlation. Compared with pixel-level semantics, these agents carry local contextual information and possess a broader receptive field. Different query pixels can therefore selectively aggregate the fine-grained local semantics of different agents, sharpening the semantic distinction between query FG and BG pixels. Concretely, the Agent Learning Encoder (ALE) is first proposed to construct an optimal transport plan that assigns different agents to aggregate support semantics from different local regions. Then, to further optimize the agents, the Agent Aggregation Decoder (AAD) and the Semantic Alignment Decoder (SAD) are constructed to go beyond the limited support set by mining valuable class-specific semantics from unlabeled data sources and the query image itself, respectively. Extensive experiments on the remote sensing benchmark iSAID indicate that the proposed method achieves state-of-the-art performance. Surprisingly, our method remains quite competitive when extended to more common natural scenarios, i.e., PASCAL-5i and COCO-20i.

* Accepted to IJCV
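
The agent idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify how the optimal transport plan is solved, so the sketch assumes a standard entropy-regularised Sinkhorn iteration with uniform marginals, a cosine-distance cost, and randomly initialised agents. All function names (`sinkhorn`, `mine_agents`, `agent_correlation`) and parameters (`eps`, `n_iters`, `temperature`) are hypothetical choices made for illustration only.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + 1e-8)


def sinkhorn(cost, eps=0.05, n_iters=50):
    """Entropy-regularised optimal transport with uniform marginals.

    cost: (n_agents, n_pixels) array.
    Returns a transport plan of the same shape whose columns sum to
    1 / n_pixels and whose rows approximately sum to 1 / n_agents.
    """
    n_agents, n_pixels = cost.shape
    kernel = np.exp(-cost / eps)
    row_marginal = np.full(n_agents, 1.0 / n_agents)
    col_marginal = np.full(n_pixels, 1.0 / n_pixels)
    u = np.ones(n_agents)
    v = np.ones(n_pixels)
    for _ in range(n_iters):
        u = row_marginal / (kernel @ v)
        v = col_marginal / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def mine_agents(support_fg, init_agents, eps=0.05, n_iters=50):
    """Assumed stand-in for agent learning: each agent aggregates the
    support foreground features that the transport plan assigns to it."""
    cost = 1.0 - l2_normalize(init_agents) @ l2_normalize(support_fg).T
    plan = sinkhorn(cost, eps, n_iters)
    weights = plan / plan.sum(axis=1, keepdims=True)  # each row sums to 1
    return weights @ support_fg  # (n_agents, dim)


def agent_correlation(query, agents, temperature=0.1):
    """Agent-level correlation: a softmax over agents for every query pixel."""
    sim = l2_normalize(query) @ l2_normalize(agents).T / temperature
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(sim)
    return attn / attn.sum(axis=1, keepdims=True)  # (n_query, n_agents)


# Toy data: 200 support foreground pixels, 300 query pixels, 16-dim features.
rng = np.random.default_rng(0)
support_fg = rng.normal(size=(200, 16))
init_agents = rng.normal(size=(5, 16))
query = rng.normal(size=(300, 16))

agents = mine_agents(support_fg, init_agents)
attn = agent_correlation(query, agents)
```

Here each row of `attn` describes how one query pixel distributes its attention over the five agents, in contrast to a pixel-level scheme that would compare it with all 200 support pixels directly. The uniform marginals are one way to encourage different agents to cover different local regions; other constraints would be equally plausible under the abstract's description.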