Aaron Phillips

LAReQA: Language-agnostic answer retrieval from a multilingual pool

Apr 11, 2020

Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang

Abstract: We present LAReQA, a challenging new benchmark for language-agnostic answer retrieval from a multilingual candidate pool. Unlike previous cross-lingual tasks, LAReQA tests for "strong" cross-lingual alignment, requiring semantically related cross-language pairs to be closer in representation space than unrelated same-language pairs. Building on multilingual BERT (mBERT), we study different strategies for achieving strong alignment. We find that augmenting training data via machine translation is effective, and improves significantly over using mBERT out-of-the-box. Interestingly, the embedding baseline that performs the best on LAReQA falls short of competing baselines on zero-shot variants of our task that only target "weak" alignment. This finding underscores our claim that language-agnostic retrieval is a substantively new kind of cross-lingual evaluation.
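The "strong" alignment condition in the abstract can be illustrated with a small sketch. This is not the paper's evaluation code: the function names, the use of cosine similarity as the closeness measure, and the two-dimensional toy vectors (one "language" axis, one "meaning" axis) are all assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def strong_alignment_holds(query, related_cross_lang, unrelated_same_lang):
    """Hypothetical check: the related cross-language item must score
    higher than the unrelated same-language item."""
    return cosine(query, related_cross_lang) > cosine(query, unrelated_same_lang)

# Toy vectors (assumed): index 0 encodes language, index 1 encodes meaning.

# Language-biased space: the language axis dominates similarity.
biased_query = [1.0, 0.2]        # English question
biased_related = [-1.0, 0.2]     # correct answer, in German
biased_unrelated = [1.0, -0.2]   # wrong answer, in English

# Language-agnostic space: the meaning axis dominates similarity.
agnostic_query = [0.1, 1.0]
agnostic_related = [-0.1, 1.0]
agnostic_unrelated = [0.1, -1.0]

biased_ok = strong_alignment_holds(biased_query, biased_related, biased_unrelated)
agnostic_ok = strong_alignment_holds(agnostic_query, agnostic_related, agnostic_unrelated)
```

In the language-biased toy space the same-language wrong answer outranks the cross-language correct one, so the condition fails; in the language-agnostic toy space it holds. A retrieval system over a multilingual pool would need the second behavior for every query.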
