Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Knowing When to Stop: Dynamic Context Cutoff for Large Language Models

Feb 03, 2025

Roy Xie, Junlin Wang, Paul Rosu, Chunyuan Deng, Bolun Sun, Zihao Lin, Bhuwan Dhingra

Figure 1 for Knowing When to Stop: Dynamic Context Cutoff for Large Language Models

Figure 2 for Knowing When to Stop: Dynamic Context Cutoff for Large Language Models

Figure 3 for Knowing When to Stop: Dynamic Context Cutoff for Large Language Models

Figure 4 for Knowing When to Stop: Dynamic Context Cutoff for Large Language Models

Share this with someone who'll enjoy it:

Abstract:Large language models (LLMs) process entire input contexts indiscriminately, which is inefficient in cases where the information required to answer a query is localized within the context. We present dynamic context cutoff, a human-inspired method enabling LLMs to self-terminate processing upon acquiring sufficient task-relevant information. Through analysis of model internals, we discover that specific attention heads inherently encode "sufficiency signals" - detectable through lightweight classifiers - that predict when critical information has been processed. This reveals a new efficiency paradigm: models' internal understanding naturally dictates processing needs rather than external compression heuristics. Comprehensive experiments across six QA datasets (up to 40K tokens) with three model families (LLaMA/Qwen/Mistral, 1B0-70B) demonstrate 1.33x average token reduction while improving accuracy by 1.3%. Furthermore, our method demonstrates better performance with the same rate of token reduction compared to other context efficiency methods. Additionally, we observe an emergent scaling phenomenon: while smaller models require require probing for sufficiency detection, larger models exhibit intrinsic self-assessment capabilities through prompting.

* Project Website: https://royxie.com/when-to-stop-project/

View paper on

Share this with someone who'll enjoy it:

Title:Knowing When to Stop: Dynamic Context Cutoff for Large Language Models

Paper and Code