Title: Composed Multi-modal Retrieval: A Survey of Approaches and Applications

Mar 03, 2025

Kun Zhang, Jingyu Li, Zhe Li, Jingjing Zhang

Abstract: With the rapid growth of multi-modal data from social media, short video platforms, and e-commerce, content-based retrieval has become essential for efficiently searching and utilizing heterogeneous information. Over time, retrieval techniques have evolved from Unimodal Retrieval (UR) to Cross-modal Retrieval (CR) and, more recently, to Composed Multi-modal Retrieval (CMR). CMR enables users to retrieve images or videos by integrating a reference visual input with textual modifications, enhancing search flexibility and precision. This paper provides a comprehensive review of CMR, covering its fundamental challenges, technical advancements, and categorization into supervised, zero-shot, and semi-supervised learning paradigms. We discuss key research directions, including data augmentation, model architecture, and loss optimization in supervised CMR, as well as transformation frameworks and external knowledge integration in zero-shot CMR. Additionally, we highlight the application potential of CMR in composed image retrieval, video retrieval, and person retrieval, which have significant implications for e-commerce, online search, and public security. Given its ability to refine and personalize search experiences, CMR is poised to become a pivotal technology in next-generation retrieval systems. A curated list of related works and resources is available at: https://github.com/kkzhang95/Awesome-Composed-Multi-modal-Retrieval
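
To make the composed-query idea in the abstract concrete, here is a minimal sketch of retrieval with a reference image plus a textual modification. It is not taken from the survey: the weighted-sum fusion, the function names (`compose_query`, `retrieve`), the `text_weight` parameter, and the three-dimensional toy embeddings are all assumptions chosen for illustration. A real system would obtain embeddings from trained image and text encoders and would likely use a learned fusion module.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def compose_query(ref_image_emb, mod_text_emb, text_weight=0.5):
    """Fuse a reference-image embedding with a modification-text embedding.

    Assumption: a simple weighted sum of unit vectors stands in for the
    learned composition modules that CMR methods typically use.
    """
    img = normalize(ref_image_emb)
    txt = normalize(mod_text_emb)
    fused = [(1 - text_weight) * i + text_weight * t for i, t in zip(img, txt)]
    return normalize(fused)


def retrieve(query_emb, gallery, top_k=1):
    """Rank gallery items by cosine similarity to the composed query."""
    scored = [
        (sum(q * g for q, g in zip(query_emb, normalize(emb))), name)
        for name, emb in gallery.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical toy embeddings with axes [dress-ness, red-ness, blue-ness].
reference_image = [1.0, 0.0, 1.0]     # a blue dress
modification_text = [0.0, 1.0, -1.0]  # "make it red instead of blue"
gallery = {
    "blue_dress": [1.0, 0.0, 1.0],
    "red_dress": [1.0, 1.0, 0.0],
    "red_shoe": [0.0, 1.0, 0.0],
}

query = compose_query(reference_image, modification_text)
ranked = retrieve(query, gallery, top_k=3)
print(ranked)
```

In this toy setup the composed query ranks the red dress first, ahead of both the unmodified reference (the blue dress) and an item that matches only the text (the red shoe). That is the behavior the abstract describes: the result keeps the content of the reference image while applying the textual change.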
