Title: Generative Retrieval with Semantic Tree-Structured Item Identifiers via Contrastive Learning

Sep 23, 2023

Zihua Si, Zhongxiang Sun, Jiale Chen, Guozhang Chen, Xiaoxue Zang, Kai Zheng, Yang Song, Xiao Zhang, Jun Xu

Abstract: The retrieval phase is a vital component in recommendation systems, requiring the model to be effective and efficient. Recently, generative retrieval has become an emerging paradigm for document retrieval, showing notable performance. These methods enjoy merits like being end-to-end differentiable, suggesting their viability in recommendation. However, these methods fall short in efficiency and effectiveness for large-scale recommendations. To obtain efficiency and effectiveness, this paper introduces a generative retrieval framework, namely SEATER, which learns SEmAntic Tree-structured item identifiERs via contrastive learning. Specifically, we employ an encoder-decoder model to extract user interests from historical behaviors and retrieve candidates via tree-structured item identifiers. SEATER devises a balanced k-ary tree structure of item identifiers, allocating semantic space to each token individually. This strategy maintains semantic consistency within the same level, while distinct levels correlate to varying semantic granularities. This structure also maintains consistent and fast inference speed for all items. Considering the tree structure, SEATER learns identifier tokens' semantics, hierarchical relationships, and inter-token dependencies. To achieve this, we incorporate two contrastive learning tasks with the generation task to optimize both the model and identifiers. The InfoNCE loss aligns the token embeddings based on their hierarchical positions. The triplet loss ranks similar identifiers in desired orders. In this way, SEATER achieves both efficiency and effectiveness. Extensive experiments on three public datasets and an industrial dataset have demonstrated that SEATER outperforms state-of-the-art models significantly.
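
To make the idea of a balanced k-ary tree of item identifiers more concrete, here is a minimal Python sketch. It is not SEATER's construction procedure, which the abstract does not describe. It assumes the items already arrive in an order where neighbors are semantically similar (for example, after some clustering step), and it simply splits that list into near-equal contiguous groups at each level. The function names `tree_depth`, `split_balanced`, and `build_identifiers` are hypothetical, and identifiers here are paths of branch indices rather than the per-node tokens the abstract mentions.

```python
def tree_depth(n_items, k):
    """Smallest depth d such that a k-ary tree of depth d has >= n_items leaves."""
    depth, capacity = 1, k
    while capacity < n_items:
        depth += 1
        capacity *= k
    return depth


def split_balanced(items, k):
    """Split a list into k contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), k)
    chunks, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def build_identifiers(items, k):
    """Map each item to a fixed-length tuple of branch indices (root to leaf).

    Assumption: `items` is pre-sorted so that nearby items are similar.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    depth = tree_depth(len(items), k)
    identifiers = {}

    def assign(chunk, prefix, levels_left):
        if levels_left == 0:
            # Balanced splitting guarantees at most one item remains here.
            for item in chunk:
                identifiers[item] = prefix
            return
        for branch, sub in enumerate(split_balanced(chunk, k)):
            if sub:  # skip empty subtrees
                assign(sub, prefix + (branch,), levels_left - 1)

    assign(list(items), (), depth)
    return identifiers


ids = build_identifiers([f"item{i}" for i in range(10)], k=3)
```

Because every identifier has the same length, decoding any item takes the same number of generation steps, which is one way to read the abstract's claim of consistent inference speed for all items. Items that share a longer identifier prefix sit in the same subtree, so deeper levels correspond to finer groupings.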

* 8 main pages, 3 pages for appendix
