Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

Mar 27, 2025

Zixu Li, Zhiheng Fu, Yupeng Hu, Zhiwei Chen, Haokun Wen, Liqiang Nie

Figure 1 for FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

Figure 2 for FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

Figure 3 for FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

Figure 4 for FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

Share this with someone who'll enjoy it:

Abstract:Composed Image Retrieval (CIR) facilitates image retrieval through a multimodal query consisting of a reference image and modification text. The reference image defines the retrieval context, while the modification text specifies desired alterations. However, existing CIR datasets predominantly employ coarse-grained modification text (CoarseMT), which inadequately captures fine-grained retrieval intents. This limitation introduces two key challenges: (1) ignoring detailed differences leads to imprecise positive samples, and (2) greater ambiguity arises when retrieving visually similar images. These issues degrade retrieval accuracy, necessitating manual result filtering or repeated queries. To address these limitations, we develop a robust fine-grained CIR data annotation pipeline that minimizes imprecise positive samples and enhances CIR systems' ability to discern modification intents accurately. Using this pipeline, we refine the FashionIQ and CIRR datasets to create two fine-grained CIR datasets: Fine-FashionIQ and Fine-CIRR. Furthermore, we introduce FineCIR, the first CIR framework explicitly designed to parse the modification text. FineCIR effectively captures fine-grained modification semantics and aligns them with ambiguous visual entities, enhancing retrieval precision. Extensive experiments demonstrate that FineCIR consistently outperforms state-of-the-art CIR baselines on both fine-grained and traditional CIR benchmark datasets. Our FineCIR code and fine-grained CIR datasets are available at https://github.com/SDU-L/FineCIR.git.

View paper on

Share this with someone who'll enjoy it:

Title:FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval

Paper and Code