Alsu Sagirova

Complexity of Symbolic Representation in Working Memory of Transformer Correlates with the Complexity of a Task

Jun 20, 2024

Alsu Sagirova, Mikhail Burtsev

Abstract: Although Transformers are used extensively for natural language processing tasks, especially machine translation, they lack an explicit memory to store key concepts of the processed text. This paper explores the properties of the content of a symbolic working memory added to the Transformer decoder. Such working memory improves the quality of model predictions in machine translation and serves as a neural-symbolic representation of the information the model needs to translate correctly. A study of the memory content revealed that keywords of the translated text are stored in the working memory, which indicates that the memory content is relevant to the processed text. In addition, the diversity of tokens and parts of speech stored in memory correlates with the complexity of the machine translation corpora.

* Cognitive Systems Research, Volume 75, 2022, Pages 16-24, ISSN 1389-0417
* 18 pages, 6 figures. Published in the journal Cognitive Systems Research 3 June 2022: https://www.sciencedirect.com/science/article/abs/pii/S1389041722000274
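
The abstract above does not say how the diversity of memory content was measured. As a rough illustration only, one simple way to quantify it is to count distinct tokens and distinct part-of-speech tags among the items written to memory. In the sketch below, the `diversity` helper, the `(token, POS)` pair format, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def diversity(items):
    """Return (number of distinct items, distinct-to-total ratio).

    This is an assumed, illustrative measure, not the paper's metric.
    """
    if not items:
        return 0, 0.0
    counts = Counter(items)
    return len(counts), len(counts) / len(items)


# Hypothetical memory content: (token, part-of-speech tag) pairs.
memory = [("cat", "NOUN"), ("sat", "VERB"), ("cat", "NOUN"), ("mat", "NOUN")]

token_diversity = diversity([tok for tok, _ in memory])
pos_diversity = diversity([pos for _, pos in memory])
```

Under this assumed measure, a corpus whose memory content yields more distinct tokens and tags would count as producing more diverse memory.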

Uncertainty Guided Global Memory Improves Multi-Hop Question Answering

Nov 29, 2023

Alsu Sagirova, Mikhail Burtsev

Abstract: Transformers have become the gold standard for many natural language processing tasks and, in particular, for multi-hop question answering (MHQA). This task involves processing a long document and reasoning over multiple parts of it. MHQA approaches fall into two primary categories. The first group focuses on extracting supporting evidence, thereby constraining the QA model's context to predicted facts. Conversely, the second group relies on the attention mechanism of the long-input encoding model to facilitate multi-hop reasoning. However, attention-based token representations lack explicit global contextual information to connect reasoning steps. To address these issues, we propose GEMFormer, a two-stage method that first collects relevant information from the entire document into memory and then combines it with local context to solve the task. Our experimental results show that fine-tuning a pre-trained model with memory-augmented input, including the most certain global elements, improves the model's performance on three MHQA datasets compared to the baseline. We also found that the global explicit memory contains information from supporting facts required for the correct answer.

* 12 pages, 7 figures. EMNLP 2023. Our code is available at https://github.com/Aloriosa/GEMFormer
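
The two-stage idea in the abstract above (collect the most certain global elements into memory, then combine them with local context) can be sketched roughly as follows. This is a minimal illustration, not the GEMFormer implementation: it assumes that certainty is measured as low entropy of a per-token predictive distribution, and the function names, the threshold value, the `[SEP]` separator, and the toy distributions are all hypothetical. The linked repository is the place to look for the authors' actual method.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_memory(tokens, distributions, threshold):
    """Stage 1 (assumed): keep tokens whose distribution entropy is below threshold."""
    return [
        tok
        for tok, dist in zip(tokens, distributions)
        if entropy(dist) < threshold
    ]


def build_input(question, memory_tokens, local_chunk, sep="[SEP]"):
    """Stage 2 (assumed): concatenate question, global memory, and local context."""
    return question + [sep] + memory_tokens + [sep] + local_chunk


# Toy document tokens with made-up predictive distributions.
tokens = ["Paris", "the", "1889"]
distributions = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction, low entropy
    [0.25, 0.25, 0.25, 0.25],  # uniform prediction, high entropy
    [0.90, 0.05, 0.03, 0.02],  # fairly confident prediction
]

memory = select_memory(tokens, distributions, threshold=0.5)
model_input = build_input(["When", "was", "it", "built", "?"], memory, ["the", "tower"])
```

In this toy run the uniform distribution for "the" exceeds the threshold, so only the two confidently predicted tokens enter memory before the input is assembled.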
