Hangfei Li

Self-distilled Dynamic Fusion Network for Language-based Fashion Retrieval

May 24, 2024

Yiming Wu, Hangfei Li, Fangfang Wang, Yilong Zhang, Ronghua Liang

Abstract: In the domain of language-based fashion image retrieval, pinpointing the desired fashion item using both a reference image and its accompanying textual description is an intriguing challenge. Existing approaches lean heavily on static fusion techniques that intertwine image and text. Despite their commendable advancements, these approaches are still limited by a lack of flexibility. In response, we propose a Self-distilled Dynamic Fusion Network that composes multi-granularity features dynamically by considering the consistency of the routing path and modality-specific information simultaneously. Two new modules are included in our proposed method: (1) Dynamic Fusion Network with Modality Specific Routers. The dynamic network enables a flexible determination of the routing for each reference image and modification text, taking into account their distinct semantics and distributions. (2) Self Path Distillation Loss. A stable path decision for queries benefits the optimization of feature extraction as well as routing, and we approach this by progressively refining the path decision with previous path information. Extensive experiments demonstrate the effectiveness of our proposed model compared to existing methods.
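
To make the two ideas in the abstract more concrete, here is a minimal, illustrative sketch in plain Python. It is not the authors' implementation: the candidate fusion operations, the averaging of the two routers' outputs, the KL-based loss, and the momentum update are all assumptions chosen for illustration, and every function name (`route`, `dynamic_fuse`, `path_distillation_loss`, `update_prev_path`) is hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(features, router_weights):
    """Hypothetical modality-specific router: one logit per candidate path."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in router_weights]
    return softmax(logits)


def dynamic_fuse(img, txt, img_router, txt_router):
    """Mix candidate fusion ops with per-query path probabilities."""
    # Assumed candidate operations: sum, element-wise product, image passthrough.
    candidates = [
        [a + b for a, b in zip(img, txt)],
        [a * b for a, b in zip(img, txt)],
        list(img),
    ]
    p_img = route(img, img_router)  # router driven by the reference image
    p_txt = route(txt, txt_router)  # router driven by the modification text
    # Assumption: the two routers' decisions are simply averaged.
    path = [(a + b) / 2 for a, b in zip(p_img, p_txt)]
    fused = [
        sum(p * cand[i] for p, cand in zip(path, candidates))
        for i in range(len(img))
    ]
    return fused, path


def path_distillation_loss(path, prev_path, eps=1e-12):
    """Assumed form: KL(prev_path || path), pulling the current path decision
    toward the path information kept from earlier training steps."""
    return sum(
        q * (math.log(q + eps) - math.log(p + eps))
        for q, p in zip(prev_path, path)
    )


def update_prev_path(prev_path, path, momentum=0.9):
    """Assumed momentum update of the stored path distribution."""
    return [momentum * q + (1 - momentum) * p for q, p in zip(prev_path, path)]
```

In this sketch the stored path acts as a slowly moving teacher for the current routing decision, which is one plausible reading of "progressively refining the path decision with previous path information"; the paper itself should be consulted for the actual formulation.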

* ICASSP 2024
