Pradyot Prakash

Dynamic Strategy Planning for Efficient Question Answering with Large Language Models

Oct 30, 2024

Tanmay Parekh, Pradyot Prakash, Alexander Radovic, Akshay Shekher, Denis Savenkov

Abstract: Research has shown the effectiveness of reasoning (e.g., Chain-of-Thought), planning (e.g., SelfAsk), and retrieval-augmented generation strategies in improving the performance of Large Language Models (LLMs) on various tasks, such as question answering. However, using a single fixed strategy to answer different kinds of questions is suboptimal in performance and inefficient in terms of generated output tokens and performed retrievals. In our work, we propose a novel technique, DyPlan, that induces a dynamic strategy selection process in LLMs to improve performance and reduce costs in question answering. DyPlan incorporates an initial decision step that selects the most suitable strategy conditioned on the input question and guides the LLM's response generation accordingly. We extend DyPlan to DyPlan-verify, adding an internal verification and correction process to further enrich the generated answer. Experiments on three prominent multi-hop question answering (MHQA) datasets show that DyPlan can improve model performance by 7-13% while reducing cost by 11-32% relative to the best baseline model.

* Under review at ACL Rolling Review
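
The decide-then-answer control flow described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: the three handlers, the `toy_decider` keyword heuristic, and the strategy labels are stand-ins chosen for illustration, and none of it is taken from the paper's implementation, where the selection is made by the LLM itself.

```python
from typing import Callable, Dict, Tuple

# Hypothetical handlers: a real system would prompt an LLM differently in each.
def answer_with_reasoning(question: str) -> str:
    return f"[reason step by step] {question}"

def answer_with_planning(question: str) -> str:
    return f"[decompose into sub-questions] {question}"

def answer_with_retrieval(question: str) -> str:
    return f"[retrieve documents, then answer] {question}"

STRATEGIES: Dict[str, Callable[[str], str]] = {
    "reasoning": answer_with_reasoning,
    "planning": answer_with_planning,
    "retrieval": answer_with_retrieval,
}

def toy_decider(question: str) -> str:
    """Stand-in for the initial decision step (assumption: keyword rules only)."""
    q = question.lower()
    if "latest" in q or "current" in q:
        return "retrieval"
    if " of the " in q:
        return "planning"
    return "reasoning"

def dynamic_answer(
    question: str, decider: Callable[[str], str] = toy_decider
) -> Tuple[str, str]:
    """Pick a strategy for this question, then answer with that strategy only."""
    strategy = decider(question)
    # Fall back to plain reasoning if the decider returns an unknown label.
    handler = STRATEGIES.get(strategy, answer_with_reasoning)
    return strategy, handler(question)
```

The point of the sketch is only that the cost of a question depends on the strategy chosen for it, so cheap questions never pay for retrieval or decomposition.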

Improving Model Factuality with Fine-grained Critique-based Evaluator

Oct 24, 2024

Yiqing Xie, Wenxuan Zhou, Pradyot Prakash, Di Jin, Yuning Mao, Quintin Fettes, Arya Talebzadeh, Sinong Wang, Han Fang, Carolyn Rose (+2 more)

Abstract: Factuality evaluation aims to detect factual errors produced by language models (LMs) and hence guide the development of more factual models. Towards this goal, we train a factuality evaluator, FenCE, that provides LM generators with claim-level factuality feedback. We conduct data augmentation on a combination of public judgment datasets to train FenCE to (1) generate textual critiques along with scores and (2) make claim-level judgments based on diverse source documents obtained by various tools. We then present a framework that leverages FenCE to improve the factuality of LM generators by constructing training data. Specifically, we generate a set of candidate responses, leverage FenCE to revise and score each response without introducing lesser-known facts, and train the generator by preferring highly scored revised responses. Experiments show that our data augmentation methods improve the evaluator's accuracy by 2.9% on LLM-AggreFact. With FenCE, we improve Llama3-8B-chat's factuality rate by 14.45% on FActScore, outperforming state-of-the-art factuality finetuning methods by 6.96%.
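
The data-construction loop in the abstract (generate candidates, revise and score each, prefer the highly scored revisions) might be sketched as follows. This is a toy under stated assumptions: `toy_evaluator` replaces the trained evaluator with a lookup against a fixed fact set, and pairing the best revised response against the lowest-scored original is an illustrative choice, not a detail confirmed by the abstract.

```python
from typing import Callable, Dict, List, Set, Tuple

def toy_evaluator(response: str, known_facts: Set[str]) -> Tuple[str, float]:
    """Hypothetical stand-in for a claim-level evaluator.

    Splits the response into sentence-level claims, keeps only claims found
    in known_facts, and scores the response by the fraction supported.
    """
    claims = [c.strip() for c in response.split(".") if c.strip()]
    if not claims:
        return response, 0.0
    supported = [c for c in claims if c in known_facts]
    revised = ". ".join(supported) + "." if supported else ""
    return revised, len(supported) / len(claims)

def build_preference_pair(
    candidates: List[str],
    evaluate: Callable[[str], Tuple[str, float]],
) -> Dict[str, str]:
    """Return one (chosen, rejected) pair for preference training.

    Assumption: chosen = revised text of the top-scored candidate,
    rejected = original text of the lowest-scored candidate.
    """
    scored = [(evaluate(c), c) for c in candidates]
    (best_revised, _), _ = max(scored, key=lambda item: item[0][1])
    _, worst_original = min(scored, key=lambda item: item[0][1])
    return {"chosen": best_revised, "rejected": worst_original}
```

A usage example with two candidates, one of which contains an unsupported claim:

```python
facts = {"Paris is in France", "The Seine flows through Paris"}
candidates = [
    "Paris is in France. Paris has 90 million residents.",
    "Paris is in France. The Seine flows through Paris.",
]
pair = build_preference_pair(candidates, lambda r: toy_evaluator(r, facts))
```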

Utilizing Lexical Similarity between Related, Low-resource Languages for Pivot-based SMT

Oct 04, 2017

Anoop Kunchukuttan, Maulik Shah, Pradyot Prakash, Pushpak Bhattacharyya

Abstract: We investigate pivot-based translation between related languages in a low-resource, phrase-based SMT setting. We show that a subword-level pivot-based SMT model using a related pivot language is substantially better than word- and morpheme-level pivot models. It is also highly competitive with the best direct translation model, which is encouraging as no direct source-target training corpus is used. We also show that combining multiple related-language pivot models can rival a direct translation model. Thus, the use of subwords as translation units coupled with multiple related pivot languages can compensate for the lack of a direct parallel corpus.

* Accepted at IJCNLP 2017, 7 pages, 7 tables
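
For readers unfamiliar with pivoting, one common way to build a source-target model without a direct parallel corpus is triangulation: the source-pivot and pivot-target translation tables are combined by marginalizing over the pivot unit, p(t|s) = Σ_p p(p|s) · p(t|p). The abstract does not state which pivot method the paper uses, so the sketch below is an assumption for illustration only; the function name and the toy tables are hypothetical.

```python
from collections import defaultdict
from typing import Dict

Table = Dict[str, Dict[str, float]]  # table[x][y] = p(y | x)

def triangulate(src_to_pivot: Table, pivot_to_tgt: Table) -> Table:
    """Combine two translation tables by summing over shared pivot units."""
    src_to_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Pivot units with no target-side entry contribute nothing.
            for tgt, p_tgt_given_pivot in pivot_to_tgt.get(pivot, {}).items():
                src_to_tgt[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in src_to_tgt.items()}

# Toy tables over abstract units; in a subword-level model the keys would be
# subword sequences rather than words.
src_to_pivot = {"a": {"x": 0.5, "y": 0.5}}
pivot_to_tgt = {"x": {"T": 1.0}, "y": {"T": 0.5, "U": 0.5}}
table = triangulate(src_to_pivot, pivot_to_tgt)
```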
