Title: A Reality Check on Context Utilisation for Retrieval-Augmented Generation

Dec 22, 2024

Lovisa Hagström, Sara Vera Marjanović, Haeun Yu, Arnav Arora, Christina Lioma, Maria Maistro, Pepa Atanasova, Isabelle Augenstein

Abstract: Retrieval-augmented generation (RAG) helps address the limitations of the parametric knowledge embedded within a language model (LM). However, investigations of how LMs utilise retrieved information of varying complexity in real-world scenarios have been limited to synthetic contexts. We introduce DRUID (Dataset of Retrieved Unreliable, Insufficient and Difficult-to-understand contexts) with real-world queries and contexts manually annotated for stance. The dataset is based on the prototypical task of automated claim verification, for which automated retrieval of real-world evidence is crucial. We compare DRUID to synthetic datasets (CounterFact, ConflictQA) and find that artificial datasets often fail to represent the complex and diverse real-world context settings. We show that synthetic datasets exaggerate context characteristics rare in real retrieved data, which leads to inflated context utilisation results, as measured by our novel ACU score. Moreover, while previous work has mainly focused on singleton context characteristics to explain context utilisation, correlations between singleton context properties and ACU on DRUID are surprisingly small compared to other properties related to context source. Overall, our work underscores the need for real-world aligned context utilisation studies to represent and improve performance in real-world RAG settings.

* 43 pages, 18 figures
