Sang-eun Han

On Monotonic Aggregation for Open-domain QA

Aug 08, 2023

Sang-eun Han, Yeonseok Jeong, Seung-won Hwang, Kyungjae Lee

Abstract: Question answering (QA) is a critical task for speech-based retrieval from knowledge sources, sifting out only the answers without requiring users to read supporting documents. Specifically, open-domain QA aims to answer user questions on unrestricted knowledge sources. Ideally, adding a source should not decrease the accuracy, but we find that this property (denoted as "monotonicity") does not hold for current state-of-the-art methods. We identify the cause, and based on that we propose the Judge-Specialist framework. Our framework consists of (1) specialist retrievers/readers to cover individual sources, and (2) a judge, a dedicated language model that selects the final answer. Our experiments show that our framework not only ensures monotonicity, but also outperforms state-of-the-art multi-source QA methods on Natural Questions. Additionally, we show that our models robustly preserve monotonicity against noise from speech recognition. We publicly release our code and setting.

* INTERSPEECH 2023 Camera Ready
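
The monotonicity property and the specialist/judge split described in the abstract above can be illustrated with a small sketch. This is not the authors' released code: the `Specialist` and `Judge` signatures, the toy lookup tables, and `toy_judge` are all hypothetical stand-ins (in the paper the judge is a language model, here it is a trivial rule), used only to show what "adding a source should not decrease accuracy" means operationally.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interfaces, assumed for illustration only.
Specialist = Callable[[str], str]             # question -> candidate answer
Judge = Callable[[str, Dict[str, str]], str]  # question, candidates -> final answer


def answer(question: str, specialists: Dict[str, Specialist], judge: Judge) -> str:
    """Each specialist answers from its own source; the judge picks one."""
    candidates = {name: s(question) for name, s in specialists.items()}
    return judge(question, candidates)


def accuracy(data: List[Tuple[str, str]],
             specialists: Dict[str, Specialist], judge: Judge) -> float:
    """Exact-match accuracy over (question, gold answer) pairs."""
    correct = sum(answer(q, specialists, judge) == gold for q, gold in data)
    return correct / len(data)


def is_monotonic(data: List[Tuple[str, str]], base: Dict[str, Specialist],
                 extra_name: str, extra: Specialist, judge: Judge) -> bool:
    """True if adding one more source does not lower accuracy."""
    before = accuracy(data, base, judge)
    after = accuracy(data, {**base, extra_name: extra}, judge)
    return after >= before


# Toy sources: each specialist only knows the facts in its own table.
WIKI = {"capital of France": "Paris"}
TABLES = {"tallest mountain": "Everest"}


def wiki(q: str) -> str:
    return WIKI.get(q, "")


def tables(q: str) -> str:
    return TABLES.get(q, "")


def toy_judge(question: str, candidates: Dict[str, str]) -> str:
    """Stand-in for the judge: return the first non-empty candidate."""
    return next((a for a in candidates.values() if a), "")


data = [("capital of France", "Paris"), ("tallest mountain", "Everest")]
acc_one = accuracy(data, {"wiki": wiki}, toy_judge)
acc_two = accuracy(data, {"wiki": wiki, "tables": tables}, toy_judge)
monotonic = is_monotonic(data, {"wiki": wiki}, "tables", tables, toy_judge)
```

In this toy setting the second source only adds correct answers, so the check passes trivially; the abstract's point is that real multi-source systems can fail this check, which is what motivates a dedicated judge.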

When to Read Documents or QA History: On Unified and Selective Open-domain QA

Jun 07, 2023

Kyungjae Lee, Sang-eun Han, Seung-won Hwang, Moontae Lee

Abstract: This paper studies the problem of open-domain question answering, with the aim of answering a diverse range of questions by leveraging knowledge resources. Two types of sources, QA-pair and document corpora, have been actively leveraged, and they have complementary strengths. The former is highly precise when a paraphrase of the given question $q$ was seen and answered during training, and is often posed as a retrieval problem, while the latter generalizes better to unseen questions. A natural follow-up is thus to leverage both models, but naive pipelining or integration approaches have failed to bring additional gains over either model alone. Our distinction is interpreting the problem as calibration, which estimates the confidence of predicted answers as an indicator to decide when to use a document or QA-pair corpus. The effectiveness of our method was validated on widely adopted benchmarks such as Natural Questions and TriviaQA.

* Findings of ACL 2023 camera ready
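
The idea of using answer confidence to decide between the QA-pair corpus and the document corpus can be sketched as a simple selection rule. This is an illustrative assumption, not the paper's method: the `Prediction` type, the `select` function, and the `threshold` value are hypothetical, and the confidences are assumed to be already calibrated probabilities.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    answer: str
    confidence: float  # assumed to be a calibrated probability in [0, 1]


def select(qa_pair_pred: Prediction, doc_pred: Prediction,
           threshold: float = 0.5) -> Prediction:
    """Trust the QA-pair answer only when its confidence is high enough.

    The QA-pair side is precise when a paraphrase of the question was seen,
    so a confident QA-pair prediction is kept; otherwise the rule falls back
    to the document reader. The threshold is a hypothetical hyperparameter.
    """
    if qa_pair_pred.confidence >= threshold:
        return qa_pair_pred
    return doc_pred


seen = select(Prediction("Paris", 0.9), Prediction("Lyon", 0.6))
unseen = select(Prediction("Paris", 0.2), Prediction("Lyon", 0.6))
```

Such a rule is only as good as the confidence estimates it consumes, which is why the abstract frames the selection problem as calibration.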

Robustifying Multi-hop QA through Pseudo-Evidentiality Training

Jul 07, 2021

Kyungjae Lee, Seung-won Hwang, Sang-eun Han, Dohyeon Lee

Abstract: This paper studies the bias problem of multi-hop question answering models, namely answering correctly without correct reasoning. One way to robustify these models is to supervise them not only to answer correctly, but also to do so with correct reasoning chains. An existing direction is to annotate reasoning chains to train models, which requires expensive additional annotations. In contrast, we propose a new approach to learn evidentiality, deciding whether the answer prediction is supported by correct evidence, without such annotations. Instead, we compare counterfactual changes in answer confidence with and without evidence sentences to generate "pseudo-evidentiality" annotations. We validate our proposed model on an original set and a challenge set in HotpotQA, showing that our method is accurate and robust in multi-hop reasoning.

* Accepted to ACL2021 (main conference)
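
The comparison of answer confidence with and without evidence sentences, mentioned in the abstract above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: `toy_confidence` is a hypothetical stand-in for a QA model's answer confidence, and the `margin` threshold and the binary labeling rule are assumptions introduced here.

```python
from typing import Callable, List, Set

# Hypothetical signature: (question, context sentences, answer) -> confidence.
Confidence = Callable[[str, List[str], str], float]


def pseudo_evidentiality(question: str, sentences: List[str], answer: str,
                         evidence_idx: Set[int], confidence: Confidence,
                         margin: float = 0.5) -> int:
    """Label 1 if removing the candidate evidence sharply lowers confidence.

    Confidence is measured on the full context and on the context with the
    candidate evidence sentences removed; a large drop suggests the answer
    actually depends on those sentences.
    """
    without = [s for i, s in enumerate(sentences) if i not in evidence_idx]
    drop = confidence(question, sentences, answer) - confidence(question, without, answer)
    return 1 if drop > margin else 0


def toy_confidence(question: str, sentences: List[str], answer: str) -> float:
    """Toy stand-in: high confidence only if the answer string appears."""
    return 0.9 if any(answer in s for s in sentences) else 0.1


context = ["Paris is the capital of France.", "France is in Europe."]
label_support = pseudo_evidentiality(
    "capital of France", context, "Paris", {0}, toy_confidence)
label_irrelevant = pseudo_evidentiality(
    "capital of France", context, "Paris", {1}, toy_confidence)
```

Here the first sentence receives label 1 because dropping it removes the only mention of the answer, while the second receives label 0 because confidence is unchanged without it.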
