Title: Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models

Sep 07, 2024

Michael Günther, Isabelle Mohr, Bo Wang, Han Xiao

Abstract: Many use cases require retrieving smaller portions of text, and dense vector-based retrieval systems often perform better with shorter text segments, as the semantics are less likely to be "over-compressed" in the embeddings. Consequently, practitioners often split text documents into smaller chunks and encode them separately. However, chunk embeddings created in this way can lose contextual information from surrounding chunks, resulting in suboptimal representations. In this paper, we introduce a novel method called "late chunking," which leverages long-context embedding models to first embed all tokens of the long text, with chunking applied after the transformer model and just before mean pooling. The resulting chunk embeddings capture the full contextual information, leading to superior results across various retrieval tasks without the need for additional training. Moreover, our method is generic enough to be applied to any long-context embedding model.

* 4 pages, early draft
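
The pooling step described in the abstract can be illustrated with a minimal sketch. It assumes the per-token output vectors of a long-context embedding model are already available for the whole document, and that chunk boundaries are given as token index spans. The names `mean_pool` and `late_chunk` and the toy vectors are hypothetical and are not taken from the paper or from any library.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    if not vectors:
        raise ValueError("cannot pool an empty span")
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / n for d in range(dim)]


def late_chunk(
    token_embeddings: Sequence[Vector],
    spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Pool token embeddings per chunk span (start inclusive, end exclusive).

    The token embeddings are assumed to come from one forward pass over the
    full document, so each token vector already reflects the surrounding text.
    """
    return [mean_pool(token_embeddings[start:end]) for start, end in spans]


# Toy stand-in for a model's per-token outputs (5 tokens, 2 dimensions).
token_embeddings = [
    [1.0, 0.0],
    [3.0, 2.0],
    [0.0, 4.0],
    [2.0, 2.0],
    [4.0, 0.0],
]

# Hypothetical chunk boundaries: tokens 0-1 form chunk one, tokens 2-4 chunk two.
spans = [(0, 2), (2, 5)]

chunk_embeddings = late_chunk(token_embeddings, spans)
print(chunk_embeddings)
```

The contrast with the conventional approach is the order of operations: there, each chunk is passed through the model separately and then pooled, so token vectors in one chunk never see the text of the others. Here the model runs once over the full text and only the pooling is done per chunk.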
