Moonsu Han

Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation

Oct 06, 2020

Minki Kang, Moonsu Han, Sung Ju Hwang

Abstract: We propose a method to automatically generate domain- and task-adaptive maskings of a given text for self-supervised pre-training, so that the language model can be adapted effectively to a particular target task (e.g., question answering). Specifically, we present a novel reinforcement learning-based framework that learns the masking policy such that using the generated masks for further pre-training of the target language model improves task performance on unseen texts. We use off-policy actor-critic with entropy regularization and experience replay for reinforcement learning, and propose a Transformer-based policy network that can consider the relative importance of words in a given text. We validate our Neural Mask Generator (NMG) on several question answering and text classification datasets using BERT and DistilBERT as the language models, on which it outperforms rule-based masking strategies by automatically learning optimal adaptive maskings.
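The idea of a masking policy that weighs words against each other can be illustrated with a small sketch. This is not the authors' NMG: the per-token scores below are hand-written stand-ins for what a learned policy network might output, and the function names (`softmax`, `sample_mask`) are hypothetical. The sketch only shows how scores could be turned into a distribution over positions and sampled without replacement to choose which tokens to mask.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_mask(tokens, scores, num_masks, rng):
    """Sample distinct positions to mask, favoring high-scoring tokens."""
    probs = softmax(scores)
    candidates = list(range(len(tokens)))
    chosen = []
    for _ in range(min(num_masks, len(tokens))):
        weights = [probs[i] for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # sample without replacement
    masked = ["[MASK]" if i in chosen else t for i, t in enumerate(tokens)]
    return masked, sorted(chosen)


# Assumed example: scores are illustrative, not produced by a trained model.
tokens = ["the", "patient", "was", "given", "aspirin", "daily"]
scores = [-2.0, 1.5, -2.0, 0.0, 3.0, 0.5]
masked, positions = sample_mask(tokens, scores, num_masks=2, rng=random.Random(0))
print(masked, positions)
```

In the paper's setting the scores would come from a policy trained with reinforcement learning, with the reward tied to downstream task performance; here they are fixed by hand purely for illustration.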

* 19 pages, 9 figures, EMNLP 2020

Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data

Mar 18, 2019

Moonsu Han, Minki Kang, Hyunwoo Jung, Sung Ju Hwang

Abstract: We consider a novel question answering (QA) task in which the machine must read from large streaming data (long documents or videos) without knowing when the questions will be given, a setting in which existing QA methods fail due to lack of scalability. To tackle this problem, we propose a novel end-to-end reading comprehension method, which we refer to as Episodic Memory Reader (EMR), that sequentially reads the input contexts into an external memory while replacing memories that are less important for answering unseen questions. Specifically, we train an RL agent to replace a memory entry when the memory is full in order to maximize its QA accuracy at a future timepoint, while encoding the external memory using the transformer architecture to learn representations that consider the relative importance between the memory entries. We validate our model on a real-world large-scale textual QA task (TriviaQA) and a video QA task (TVQA), on which it achieves significant improvements over rule-based memory scheduling policies or an RL-based baseline that learns the query-specific importance of each memory independently.
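The replace-when-full behavior described above can be sketched as a fixed-capacity memory fed by a stream. This is a simplified illustration, not the EMR model: the `importance` function is a hypothetical hand-written heuristic standing in for the learned RL policy, and `read_stream` is an assumed name. When the memory is full, the sketch scores the stored entries together with the incoming one and evicts the least important.

```python
def read_stream(stream, capacity, importance):
    """Read items one at a time into a memory of at most `capacity` entries."""
    memory = []
    for item in stream:
        if len(memory) < capacity:
            memory.append(item)
            continue
        # Memory is full: compare the new item with the stored entries
        # and drop whichever one scores lowest (first one on ties).
        candidates = memory + [item]
        evict = min(range(len(candidates)), key=lambda i: importance(candidates[i]))
        del candidates[evict]
        memory = candidates
    return memory


def toy_importance(sentence):
    """Assumed stand-in for a learned policy: longer sentences score higher."""
    return len(sentence.split())


stream = [
    "filler",
    "Paris is the capital of France",
    "filler",
    "um",
    "The Seine flows through Paris",
    "filler",
]
kept = read_stream(stream, capacity=2, importance=toy_importance)
print(kept)
```

A learned agent would differ from this heuristic in that it is trained to maximize QA accuracy at a future timepoint and scores entries relative to one another, rather than applying a fixed rule to each entry in isolation.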

* 14 pages, 15 figures

Learning What to Remember: Long-term Episodic Memory Networks for Learning from Streaming Data

Dec 11, 2018

Hyunwoo Jung, Moonsu Han, Minki Kang, Sungju Hwang

Abstract: The current generation of memory-augmented neural networks has limited scalability, as they cannot efficiently process data that are too large to fit in the external memory storage. One example is the lifelong learning scenario, where the model receives a data stream of unlimited length as input, the vast majority of which consists of uninformative entries. We tackle this problem by proposing a memory network suited to the long-term lifelong learning scenario, which we refer to as Long-term Episodic Memory Networks (LEMN). It features an RNN-based retention agent that learns to replace less important memory entries based on a retention probability generated for each entry; this probability is learned to identify data instances of generic importance relative to other memory entries, as well as their historical importance. Training the retention agent in this way allows our long-term episodic memory network to retain memory entries of generic importance for a given task. We validate our model on a path-finding task as well as synthetic and real question answering tasks, on which it achieves significant improvements over memory-augmented networks with rule-based memory scheduling as well as an RL-based baseline that does not consider the relative or historical importance of the memory.
