Abstract: The recent success of machine learning systems on various QA datasets could be interpreted as a significant improvement in models' language understanding abilities. However, multiple recent works have shown, using various perturbations, that good performance on a dataset does not necessarily indicate performance that matches what humans would expect from models that "understand" language. In this work we take a top-performing model on several Multiple Choice Question Answering (MCQA) datasets and evaluate it against a set of expectations one might have of such a model, using a series of zero-information perturbations of the model's inputs. Our results show that the model clearly falls short of these expectations, and they motivate a modified training approach that forces the model to better attend to its inputs. We show that the new training paradigm leads to a model that performs on par with the original model while better satisfying our expectations.
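The abstract above evaluates a model with "zero-information perturbations" of its inputs. As a rough illustration only, the following sketch shows one way such a check could be set up: the `predict` callable, the choice of perturbations (empty question, shuffled question tokens), and the agreement metric are all assumptions for illustration, not the paper's actual protocol.

```python
# Hypothetical sketch of a zero-information perturbation check for an MCQA model.
# `predict` is a placeholder for any scorer mapping (question, options) -> option index;
# the perturbations below are illustrative assumptions, not the paper's exact setup.
import random


def shuffle_tokens(text: str, seed: int = 0) -> str:
    """Return the question with its word order randomly permuted."""
    tokens = text.split()
    rng = random.Random(seed)
    rng.shuffle(tokens)
    return " ".join(tokens)


def perturbation_agreement(predict, examples):
    """Fraction of examples whose prediction is unchanged when the question
    carries no usable information (removed or shuffled). High agreement
    suggests the model is not really attending to the question."""
    stats = {"no_question": 0, "shuffled_question": 0}
    for question, options in examples:
        original = predict(question, options)
        if predict("", options) == original:
            stats["no_question"] += 1
        if predict(shuffle_tokens(question), options) == original:
            stats["shuffled_question"] += 1
    n = max(len(examples), 1)
    return {name: count / n for name, count in stats.items()}


if __name__ == "__main__":
    # Trivial baseline that ignores the question entirely: agreement is 100%.
    dummy_predict = lambda q, opts: 0
    data = [("What is the capital of France?", ["Paris", "Lyon"]),
            ("Which planet is largest?", ["Mars", "Jupiter"])]
    print(perturbation_agreement(dummy_predict, data))
```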
Abstract: Real-world question answering can be significantly more complex than what most existing QA datasets reflect. Questions posed by users on websites such as online travel forums may consist of multiple sentences, and not everything mentioned in a question is relevant to finding its answer. Such questions typically have a huge candidate answer space and require complex reasoning over large knowledge corpora. We introduce the novel task of answering entity-seeking recommendation questions using a collection of reviews that describe candidate answer entities. We harvest a QA dataset that contains 48,147 paragraph-sized real user questions from travelers seeking recommendations for hotels, attractions, and restaurants. Each candidate answer is associated with a collection of unstructured reviews. This dataset is challenging because commonly used neural architectures for QA are prohibitively expensive for a task of this scale. As a solution, we design a scalable cluster-select-rerank approach. It first clusters the text for each entity to identify exemplar sentences describing that entity. It then uses a scalable neural information retrieval (IR) module to subselect a set of potential entities from the large candidate set. Finally, a reranker uses a deeper attention-based architecture to pick the best answers from the selected entities. This strategy outperforms both a pure IR and a pure attention-based reasoning approach, yielding nearly 10% relative improvement in Accuracy@3 over both.
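To make the three-stage cluster-select-rerank pipeline concrete, here is a minimal sketch under simplifying assumptions: KMeans stands in for the clustering step, TF-IDF cosine similarity stands in for the scalable neural IR module, and the reranker is a placeholder scorer. None of these stand-ins reproduce the paper's actual neural IR or attention-based reranker; the function names and structure are hypothetical.

```python
# Simplified cluster-select-rerank sketch (stand-ins only, not the paper's system).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def exemplar_sentences(review_sentences, n_clusters=3):
    """Cluster an entity's review sentences and keep one exemplar per cluster."""
    vec = TfidfVectorizer().fit(review_sentences)
    X = vec.transform(review_sentences)
    k = min(n_clusters, len(review_sentences))
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    exemplars = []
    for c in range(k):
        idx = np.where(km.labels_ == c)[0]
        # Keep the sentence closest to the cluster centroid as the exemplar.
        sims = cosine_similarity(X[idx], km.cluster_centers_[c].reshape(1, -1))
        exemplars.append(review_sentences[idx[int(np.argmax(sims))]])
    return exemplars


def cluster_select_rerank(question, entity_reviews, top_k=10):
    """entity_reviews: dict mapping entity name -> list of review sentences."""
    names = list(entity_reviews)
    # Cluster: represent each entity by its exemplar sentences.
    docs = [" ".join(exemplar_sentences(entity_reviews[n])) for n in names]
    # Select: cheap retrieval over all entities (stand-in for the neural IR module).
    vec = TfidfVectorizer().fit(docs + [question])
    sims = cosine_similarity(vec.transform([question]), vec.transform(docs))[0]
    selected = np.argsort(-sims)[:top_k]
    # Rerank: placeholder score; the real system would apply a deeper
    # attention-based model to (question, exemplar) pairs here.
    rerank_score = lambda i: sims[i]
    return [names[i] for i in sorted(selected, key=rerank_score, reverse=True)]
```

The design point the sketch tries to convey is the cost split: the expensive scoring model is applied only to the small set of entities surviving the cheap clustering and retrieval stages, which is what makes the approach tractable at the scale described above.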