Title: ReMamber: Referring Image Segmentation with Mamba Twister

Mar 26, 2024

Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya Zhang, Yanfeng Wang

Abstract: Referring Image Segmentation (RIS) with transformers has achieved great success in interpreting complex visual-language tasks. However, the quadratic computation cost of transformers makes capturing long-range visual-language dependencies resource-intensive. Mamba addresses this with linear-complexity processing. Directly applying Mamba to multi-modal interaction is nonetheless challenging, primarily because its channel interactions are inadequate for effectively fusing multi-modal data. In this paper, we propose ReMamber, a novel RIS architecture that combines Mamba with a multi-modal Mamba Twister block. The Mamba Twister explicitly models image-text interaction and fuses textual and visual features through a channel and spatial twisting mechanism. We achieve state-of-the-art results on three challenging benchmarks. We also conduct thorough analyses of ReMamber and discuss other Mamba-based fusion designs, which provide useful perspectives for future research.
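
The contrast the abstract draws between quadratic and linear cost can be illustrated with a toy sketch. This is not the Mamba or ReMamber implementation: the function names, the scalar state, the fixed `decay` and `input_gain` values, and the token counts are all assumptions made for illustration. Mamba itself uses a selective state-space model whose parameters depend on the input, which this sketch does not attempt to reproduce.

```python
# Illustrative only: a toy scalar linear recurrence, not Mamba or ReMamber.


def attention_pair_count(num_tokens: int) -> int:
    """Number of token-to-token scores that full self-attention computes."""
    return num_tokens * num_tokens


def linear_scan(inputs, decay=0.9, input_gain=0.1):
    """Run h_t = decay * h_{t-1} + input_gain * x_t in a single pass.

    Returns the per-step states and the number of recurrence steps taken.
    `decay` and `input_gain` are hypothetical fixed constants.
    """
    state = 0.0
    outputs = []
    steps = 0
    for x in inputs:
        state = decay * state + input_gain * x
        outputs.append(state)
        steps += 1
    return outputs, steps


if __name__ == "__main__":
    # Hypothetical token counts, e.g. 14x14, 28x28, and 56x56 patch grids.
    for n in (196, 784, 3136):
        _, steps = linear_scan([1.0] * n)
        print(f"tokens={n} attention_pairs={attention_pair_count(n)} scan_steps={steps}")
```

Quadrupling the number of tokens multiplies the attention pair count by sixteen but the scan step count only by four, which is the scaling gap the abstract refers to.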
