Cristina Giannone

MeMo: Towards Language Models with Associative Memory Mechanisms

Feb 18, 2025

Fabio Massimo Zanzotto, Elena Sofia Ruzzetti, Giancarlo A. Xompero, Leonardo Ranaldi, Davide Venditti, Federico Ranaldi, Cristina Giannone, Andrea Favalli, Raniero Romagnoli

Abstract: Memorization is a fundamental ability of Transformer-based Large Language Models, achieved through learning. In this paper, we propose a paradigm shift by designing an architecture to memorize text directly, bearing in mind the principle that memorization precedes learning. We introduce MeMo, a novel architecture for language modeling that explicitly memorizes sequences of tokens in layered associative memories. By design, MeMo offers transparency and the possibility of model editing, including forgetting texts. We experimented with the MeMo architecture, showing the memorization power of the one-layer and the multi-layer configurations.
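The idea of storing and forgetting token associations can be illustrated with a classic correlation-matrix memory. This is only a minimal sketch under stated assumptions, not the MeMo architecture itself: the class `AssociativeMemory`, the dimensionality `DIM`, the single-token keys, and the decoding threshold are all hypothetical choices made for illustration.

```python
import math
import random

DIM = 256  # vector dimensionality (illustrative assumption)


def random_unit_vector(rng, dim=DIM):
    """Random unit vector; such vectors are nearly orthogonal in high dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class AssociativeMemory:
    """Hypothetical correlation-matrix memory mapping a token to its successor."""

    def __init__(self, vocab, seed=0):
        rng = random.Random(seed)
        # Separate random codes for tokens used as keys and as values.
        self.keys = {t: random_unit_vector(rng) for t in vocab}
        self.values = {t: random_unit_vector(rng) for t in vocab}
        self.matrix = [[0.0] * DIM for _ in range(DIM)]

    def _update(self, key_token, value_token, sign):
        # Add (or subtract) the outer product value * key^T to the matrix.
        k = self.keys[key_token]
        v = self.values[value_token]
        for i, vi in enumerate(v):
            row = self.matrix[i]
            s = sign * vi
            for j, kj in enumerate(k):
                row[j] += s * kj

    def store(self, key_token, value_token):
        self._update(key_token, value_token, +1.0)

    def forget(self, key_token, value_token):
        # Forgetting subtracts exactly what storing added.
        self._update(key_token, value_token, -1.0)

    def retrieve(self, key_token, threshold=0.5):
        # Multiply the matrix by the key, then decode to the closest value code.
        k = self.keys[key_token]
        out = [dot(row, k) for row in self.matrix]
        best, score = max(
            ((t, dot(vec, out)) for t, vec in self.values.items()),
            key=lambda pair: pair[1],
        )
        return best if score >= threshold else None


memory = AssociativeMemory(["the", "cat", "sat", "on", "mat"])
for current, nxt in [("the", "cat"), ("cat", "sat"), ("sat", "on")]:
    memory.store(current, nxt)
print(memory.retrieve("the"))   # stored association
memory.forget("the", "cat")
print(memory.retrieve("the"))   # association removed
```

Because every stored association is an additive term in the matrix, removing one is a subtraction that leaves the others in place, which is the kind of editability the abstract refers to.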

Enhancing Data Privacy in Large Language Models through Private Association Editing

Jun 26, 2024

Davide Venditti, Elena Sofia Ruzzetti, Giancarlo A. Xompero, Cristina Giannone, Andrea Favalli, Raniero Romagnoli, Fabio Massimo Zanzotto

Abstract: Large Language Models (LLMs) are powerful tools with extensive applications, but their tendency to memorize private information raises significant concerns, as private data leakage can easily happen. In this paper, we introduce Private Association Editing (PAE), a novel defense approach against private data leakage. PAE is designed to effectively remove Personally Identifiable Information (PII) without retraining the model. Our approach consists of a four-step procedure: detecting memorized PII, applying PAE cards to mitigate memorization of private data, verifying resilience to targeted data extraction (TDE) attacks, and ensuring consistency in the post-edit LLMs. The versatility and efficiency of PAE, which allows for batch modifications, significantly enhance data privacy in LLMs. Experimental results demonstrate the effectiveness of PAE in mitigating private data leakage. We believe PAE will serve as a critical tool in the ongoing effort to protect data privacy in LLMs, encouraging the development of safer models for real-world applications.

Investigating the Impact of Data Contamination of Large Language Models in Text-to-SQL Translation

Feb 12, 2024

Federico Ranaldi, Elena Sofia Ruzzetti, Dario Onorati, Leonardo Ranaldi, Cristina Giannone, Andrea Favalli, Raniero Romagnoli, Fabio Massimo Zanzotto

Abstract: Understanding textual descriptions to generate code seems to be an achieved capability of instruction-following Large Language Models (LLMs) in zero-shot scenarios. However, there is a serious possibility that this translation ability is influenced by the model having seen the target textual descriptions and the related code. This effect is known as Data Contamination. In this study, we investigate the impact of Data Contamination on the performance of GPT-3.5 in Text-to-SQL code-generation tasks. To this end, we introduce a novel method to detect Data Contamination in GPTs and examine GPT-3.5's Text-to-SQL performance using the well-known Spider dataset and our new, unfamiliar dataset Termite. Furthermore, we analyze GPT-3.5's efficacy on databases with modified information via an adversarial table disconnection (ATD) approach, which complicates Text-to-SQL tasks by removing structural pieces of information from the database. Our results indicate a significant performance drop in GPT-3.5 on the unfamiliar Termite dataset, even with ATD modifications, highlighting the effect of Data Contamination on LLMs in Text-to-SQL translation tasks.

Every time I fire a conversational designer, the performance of the dialog system goes down

Sep 27, 2021

Giancarlo A. Xompero, Michele Mastromattei, Samir Salman, Cristina Giannone, Andrea Favalli, Raniero Romagnoli, Fabio Massimo Zanzotto

Abstract: Incorporating explicit domain knowledge into neural-based task-oriented dialogue systems is an effective way to reduce the need for large sets of annotated dialogues. In this paper, we investigate how the use of explicit domain knowledge from conversational designers affects the performance of neural-based dialogue systems. To support this investigation, we propose the Conversational-Logic-Injection-in-Neural-Network system (CLINN), where explicit knowledge is coded in semi-logical rules. Using CLINN, we evaluated semi-logical rules produced by a team of differently skilled conversational designers. We experimented with the Restaurant topic of the MultiWOZ dataset. Results show that external knowledge is extremely important for reducing the need for annotated examples in conversational systems. In fact, rules from conversational designers used in CLINN significantly outperform a state-of-the-art neural-based dialogue system.
