Nelson Liu

Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking

Nov 30, 2021

Ronen Tamari, Kyle Richardson, Aviad Sar-Shalom, Noam Kahlon, Nelson Liu, Reut Tsarfaty, Dafna Shahaf

Abstract: While neural language models often perform surprisingly well on natural language understanding (NLU) tasks, their strengths and limitations remain poorly understood. Controlled synthetic tasks are thus an increasingly important resource for diagnosing model behavior. In this work we focus on story understanding, a core competency for NLU systems. However, the main synthetic resource for story understanding, the bAbI benchmark, lacks such a systematic mechanism for controllable task generation. We develop Dyna-bAbI, a dynamic framework providing fine-grained control over task generation in bAbI. We demonstrate our ideas by constructing three new tasks requiring compositional generalization, an important evaluation setting absent from the original benchmark. We tested both special-purpose models developed for bAbI as well as state-of-the-art pre-trained methods, and found that while both approaches solve the original tasks (>99% accuracy), neither approach succeeded in the compositional generalization setting, indicating the limitations of the original training data. We explored ways to augment the original data, and found that though diversifying training data was far more useful than simply increasing dataset size, it was still insufficient for driving robust compositional generalization (with <70% accuracy for complex compositions). Our results underscore the importance of highly controllable task generators for creating robust NLU systems through a virtuous cycle of model and data development.

* Code and data will be made available at project page: https://tiny.one/8wjxwd7z

AllenNLP: A Deep Semantic Natural Language Processing Platform

May 31, 2018

Matt Gardner, Joel Grus, Mark Neumann, Oyvind Tafjord, Pradeep Dasigi, Nelson Liu, Matthew Peters, Michael Schmitz, Luke Zettlemoyer

Abstract: This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding. AllenNLP is designed to support researchers who want to build novel language understanding models quickly and easily. It is built on top of PyTorch, allowing for dynamic computation graphs, and provides (1) a flexible data API that handles intelligent batching and padding, (2) high-level abstractions for common operations in working with text, and (3) a modular and extensible experiment framework that makes doing good science easy. It also includes reference implementations of high quality approaches for both core semantic problems (e.g. semantic role labeling (Palmer et al., 2005)) and language understanding applications (e.g. machine comprehension (Rajpurkar et al., 2016)). AllenNLP is an ongoing open-source effort maintained by engineers and researchers at the Allen Institute for Artificial Intelligence.

* Describes the initial version of AllenNLP. Many features and models have been added since the first release. This is the paper to cite if you use AllenNLP in your research. Updated 5/31/2018 with version accepted to the NLP OSS workshop held at ACL 2018.
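
The abstract mentions a data API that handles "intelligent batching and padding" but does not say how. As a rough illustration of the general idea only, the sketch below sorts token-id sequences by length and pads each batch to its own longest sequence, so less padding is wasted. It is plain Python and is not AllenNLP's API: the function name `bucket_and_pad`, the constant `PAD_ID`, and the sample data are all hypothetical.

```python
from typing import List

PAD_ID = 0  # hypothetical padding token id, chosen for illustration


def bucket_and_pad(sequences: List[List[int]], batch_size: int) -> List[List[List[int]]]:
    """Group sequences of similar length, then pad each batch to its own max length."""
    # Sorting by length puts similarly sized sequences in the same batch.
    ordered = sorted(sequences, key=len)
    batches = []
    for start in range(0, len(ordered), batch_size):
        batch = ordered[start:start + batch_size]
        width = max(len(seq) for seq in batch)
        # Right-pad every sequence in the batch up to the batch's longest one.
        batches.append([seq + [PAD_ID] * (width - len(seq)) for seq in batch])
    return batches


if __name__ == "__main__":
    data = [[5, 6, 7, 8], [1], [2, 3], [9, 9, 9]]
    for batch in bucket_and_pad(data, batch_size=2):
        print(batch)
```

With this toy input, the two short sequences share a batch padded to length 2 and the two long ones share a batch padded to length 4, instead of padding everything to length 4.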
