Abstract: On a wide range of natural language processing and information retrieval tasks, transformer-based models, particularly pre-trained language models like BERT, have demonstrated tremendous effectiveness. Due to the quadratic complexity of the self-attention mechanism, however, such models have difficulty processing long documents. Recent works dealing with this issue include truncating long documents, segmenting them into passages that can be processed by a standard BERT model, or modifying the self-attention mechanism to make it sparser, as in sparse-attention models. However, these approaches either lose information or have high computational complexity (and, in the latter case, consume substantial time, memory, and energy). We follow here a slightly different approach in which key blocks of a long document are first selected by local query-block pre-ranking, and a few of these blocks are then aggregated to form a short document that can be processed by a model such as BERT. Experiments conducted on standard Information Retrieval datasets demonstrate the effectiveness of the proposed approach.
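
To make the select-then-aggregate idea concrete, the following is a minimal sketch, not the paper's actual method: it stands in for the local query-block pre-ranking with a simple term-overlap heuristic, splits the document into fixed-size token windows (an assumed blocking scheme), keeps the top-k blocks in their original order, and concatenates them into a short document that a standard BERT-style ranker could then score against the query.

```python
"""Illustrative sketch of query-driven key-block selection (assumptions noted inline)."""


def split_into_blocks(document: str, block_size: int = 64) -> list[str]:
    """Split a long document into fixed-size blocks of whitespace tokens (assumed scheme)."""
    tokens = document.split()
    return [" ".join(tokens[i:i + block_size]) for i in range(0, len(tokens), block_size)]


def block_score(query: str, block: str) -> float:
    """Toy local pre-ranking score: fraction of query terms appearing in the block.

    This is a placeholder for whatever lexical or learned scorer performs the
    query-block pre-ranking; it is not the scorer used in the paper.
    """
    q_terms = set(query.lower().split())
    b_terms = set(block.lower().split())
    return len(q_terms & b_terms) / max(len(q_terms), 1)


def select_key_blocks(query: str, document: str, k: int = 3, block_size: int = 64) -> str:
    """Keep the k best-scoring blocks (restored to document order) and concatenate them
    into a short document that fits a standard BERT input length."""
    blocks = split_into_blocks(document, block_size)
    ranked = sorted(range(len(blocks)),
                    key=lambda i: block_score(query, blocks[i]),
                    reverse=True)
    kept = sorted(ranked[:k])  # preserve the original order of the selected blocks
    return " ".join(blocks[i] for i in kept)


if __name__ == "__main__":
    query = "effects of caffeine on sleep"
    long_doc = (" ".join(["filler text about unrelated topics"] * 200)
                + " caffeine intake in the evening can delay sleep onset"
                + " and reduce overall sleep quality")
    short_doc = select_key_blocks(query, long_doc, k=2)
    print(short_doc)
```

The aggregated `short_doc` would then be paired with the query and fed to a cross-encoder such as BERT for relevance scoring, as in standard re-ranking pipelines; the block size, number of kept blocks, and pre-ranking scorer above are all illustrative choices.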