Title: A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models

Feb 23, 2024

Stefan Hegselmann, Shannon Zejiang Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang

Abstract: Patients often face difficulties in understanding their hospitalizations, while healthcare workers have limited resources to provide explanations. In this work, we investigate the potential of large language models to generate patient summaries based on doctors' notes and study the effect of training data on the faithfulness and quality of the generated summaries. To this end, we develop a rigorous labeling protocol for hallucinations, and have two medical experts annotate 100 real-world summaries and 100 generated summaries. We show that fine-tuning on hallucination-free data effectively reduces hallucinations from 2.60 to 1.55 per summary for Llama 2, while preserving relevant information. Although the effect is still present, it is much smaller for GPT-4 when prompted with five examples (0.70 to 0.40). We also conduct a qualitative evaluation using hallucination-free and improved training data. GPT-4 shows very good results even in the zero-shot setting. We find that common quantitative metrics do not correlate well with faithfulness and quality. Finally, we test GPT-4 for automatic hallucination detection, which yields promising results.
