Title: FOR: Finetuning for Object Level Open Vocabulary Image Retrieval

Dec 25, 2024

Hila Levi, Guy Heller, Dan Levi

Abstract: As working with large datasets becomes standard, the task of accurately retrieving images containing objects of interest by an open set textual query gains practical importance. The current leading approach utilizes a pre-trained CLIP model without any adaptation to the target domain, balancing accuracy and efficiency through additional post-processing. In this work, we propose FOR: Finetuning for Object-centric Open-vocabulary Image Retrieval, which allows finetuning on a target dataset using closed-set labels while keeping the visual-language association crucial for open vocabulary retrieval. FOR is based on two design elements: a specialized decoder variant of the CLIP head customized for the intended task, and its coupling within a multi-objective training framework. Together, these design choices result in a significant increase in accuracy, showcasing improvements of up to 8 mAP@50 points over SoTA across three datasets. Additionally, we demonstrate that FOR is also effective in a semi-supervised setting, achieving impressive results even when only a small portion of the dataset is labeled.

* WACV 2025
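
For readers unfamiliar with the retrieval setting the abstract refers to, the following is a minimal sketch of ranking images against a text query in a shared embedding space. It is not the FOR method and does not call CLIP: the embeddings are made-up toy vectors, and the names `cosine`, `rank_images`, `image_embeddings`, and `query_embedding` are hypothetical, chosen only for illustration. In a real system the vectors would come from a vision-language encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_images(query_embedding, image_embeddings):
    """Return image ids sorted from most to least similar to the query."""
    scores = {
        img_id: cosine(query_embedding, emb)
        for img_id, emb in image_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy 3-dimensional embeddings (assumed values, not produced by any model).
image_embeddings = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.1, 0.9, 0.1],
    "img_c": [0.5, 0.5, 0.5],
}
# Stand-in for an encoded open-vocabulary text query.
query_embedding = [1.0, 0.0, 0.0]

ranking = rank_images(query_embedding, image_embeddings)
print(ranking)
```

The ranked list is what a metric such as mAP@50 would then be computed over, given ground-truth labels for which images contain the queried object.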
