Title: Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs

Apr 16, 2024

Woomin Song, Seunghyuk Oh, Sangwoo Mo, Jaehyung Kim, Sukmin Yun, Jung-Woo Ha, Jinwoo Shin

Abstract: Large language models (LLMs) have shown remarkable performance in various natural language processing tasks. However, a primary constraint they face is the context limit, i.e., the maximum number of tokens they can process. Previous works have explored architectural changes and modifications in positional encoding to relax the constraint, but they often require expensive training or do not address the computational demands of self-attention. In this paper, we present Hierarchical cOntext MERging (HOMER), a new training-free scheme designed to overcome these limitations. HOMER uses a divide-and-conquer algorithm, dividing long inputs into manageable chunks. Each chunk is then processed collectively, employing a hierarchical strategy that merges adjacent chunks at progressive transformer layers. A token reduction technique precedes each merging, ensuring memory usage efficiency. We also propose an optimized computational order reducing the memory requirement to logarithmically scale with respect to input length, making it especially favorable for environments with tight memory restrictions. Our experiments demonstrate the proposed method's superior performance and memory efficiency, enabling the broader use of LLMs in settings requiring extended context. Code is available at https://github.com/alinlab/HOMER.
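
The divide-and-conquer structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that): it works on plain Python lists rather than transformer hidden states, skips the transformer layers that would run between merges, and uses a hypothetical `score` callback as a stand-in for whatever importance measure the token reduction step uses. The function names and the requirement that the chunk count be a power of two are likewise assumptions made for illustration. The second function visits the same merge tree depth-first, which is one plausible reading of the "optimized computational order": only one pending result per tree level needs to be held at a time.

```python
from typing import Callable, List, Sequence


def reduce_tokens(chunk: Sequence[int], keep: int,
                  score: Callable[[int], float]) -> List[int]:
    """Keep the `keep` highest-scoring tokens, preserving their original order.

    `score` is a hypothetical placeholder for an importance measure.
    """
    if len(chunk) <= keep:
        return list(chunk)
    top = sorted(range(len(chunk)), key=lambda i: score(chunk[i]), reverse=True)[:keep]
    return [chunk[i] for i in sorted(top)]


def merge_level_by_level(tokens: Sequence[int], num_chunks: int, keep: int,
                         score: Callable[[int], float]) -> List[int]:
    """Split into chunks, then repeatedly reduce each chunk and merge adjacent pairs."""
    # Assumption for this sketch: a power-of-two chunk count that divides the input evenly.
    assert num_chunks > 0 and num_chunks & (num_chunks - 1) == 0
    assert len(tokens) % num_chunks == 0
    size = len(tokens) // num_chunks
    chunks = [list(tokens[i:i + size]) for i in range(0, len(tokens), size)]
    while len(chunks) > 1:
        # In a real model, further transformer layers would run on each chunk here.
        reduced = [reduce_tokens(c, keep, score) for c in chunks]
        chunks = [reduced[i] + reduced[i + 1] for i in range(0, len(reduced), 2)]
    return chunks[0]


def merge_depth_first(tokens: Sequence[int], num_chunks: int, keep: int,
                      score: Callable[[int], float]) -> List[int]:
    """Same merge tree, visited depth-first so few intermediate results are alive at once."""
    if num_chunks == 1:
        return list(tokens)
    mid = len(tokens) // 2
    left = merge_depth_first(tokens[:mid], num_chunks // 2, keep, score)
    right = merge_depth_first(tokens[mid:], num_chunks // 2, keep, score)
    return reduce_tokens(left, keep, score) + reduce_tokens(right, keep, score)


if __name__ == "__main__":
    toy_input = list(range(16))  # 16 token ids; a higher id counts as "more important" here
    print(merge_level_by_level(toy_input, num_chunks=4, keep=2, score=float))
    print(merge_depth_first(toy_input, num_chunks=4, keep=2, score=float))
```

Both traversal orders produce the same merged sequence in this toy setting; they differ only in how many partial results must be kept in memory at the same time.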

* Accepted to ICLR 2024. The first two authors contributed equally.
