Title: Pinpoint, Not Criticize: Refining Large Language Models via Fine-Grained Actionable Feedback

Nov 15, 2023

Wenda Xu, Daniel Deutsch, Mara Finkelstein, Juraj Juraska, Biao Zhang, Zhongtao Liu, William Yang Wang, Lei Li, Markus Freitag

Abstract: Recent improvements in text generation have leveraged human feedback to improve the quality of the generated output. However, human feedback is not always available, especially during inference. In this work, we propose an inference-time optimization method, FITO, that uses fine-grained actionable feedback in the form of error type, error location, and severity level, all predicted by a learned error pinpoint model, for iterative refinement. FITO starts with an initial output, then iteratively incorporates the feedback via a refinement model that generates an improved output conditioned on the feedback. Because refined samples are not guaranteed to improve consistently across iterative steps, we formulate iterative refinement as a local search problem and develop a simulated-annealing-based algorithm that balances exploration of the search space with optimization for output quality. We conduct experiments on three text generation tasks: machine translation, long-form question answering (QA), and topical summarization. With a single iteration of refinement, we observe gains of 0.8 and 0.7 MetricX on Chinese-English and English-German translation, and gains of 4.5 and 1.8 ROUGE-L on long-form QA and topical summarization, respectively. With our simulated annealing algorithm, we see further quality improvements, including up to 1.7 MetricX improvement over the baseline approach.
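The refinement loop described in the abstract can be illustrated with a generic simulated annealing sketch. This is not the authors' implementation: the function names (`anneal_refine`, `pinpoint`, `refine`, `score`), the cooling schedule, and the toy typo-fixing stand-ins are all assumptions made for illustration. In the paper's setting, `pinpoint` would be a learned error pinpoint model and `refine` a refinement model conditioned on its feedback.

```python
import math
import random
from typing import Callable, List, Tuple

# Hypothetical feedback format: (error type, error span, severity level).
Feedback = List[Tuple[str, str, str]]


def anneal_refine(
    initial: str,
    pinpoint: Callable[[str], Feedback],
    refine: Callable[[str, Feedback], str],
    score: Callable[[str], float],
    steps: int = 10,
    temperature: float = 1.0,
    cooling: float = 0.8,
    seed: int = 0,
) -> str:
    """Generic simulated annealing over refinements (illustrative only)."""
    rng = random.Random(seed)
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    for _ in range(steps):
        feedback = pinpoint(current)
        if not feedback:  # no errors predicted, nothing left to refine
            break
        candidate = refine(current, feedback)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(delta / T) to keep exploring.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        # Geometric cooling (an assumed schedule), floored to avoid T = 0.
        temperature = max(temperature * cooling, 1e-6)
    return best


# Toy stand-ins for the learned models: known typos play the role of errors.
TYPOS = {"teh": "the", "recieve": "receive"}


def toy_pinpoint(text: str) -> Feedback:
    return [("spelling", w, "minor") for w in text.split() if w in TYPOS]


def toy_refine(text: str, feedback: Feedback) -> str:
    _, span, _ = feedback[0]  # fix only the first reported error
    return text.replace(span, TYPOS[span], 1)


def toy_score(text: str) -> float:
    return -float(len(toy_pinpoint(text)))  # fewer errors is better


result = anneal_refine(
    "teh users recieve teh report", toy_pinpoint, toy_refine, toy_score
)
print(result)  # the users receive the report
```

In this toy run every refinement improves the score, so the probabilistic acceptance branch never fires. It matters when a refinement model can produce a worse candidate, which is the situation the abstract's exploration-versus-quality trade-off addresses.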
