Anna Currey

MemInsight: Autonomous Memory Augmentation for LLM Agents

Mar 27, 2025

Rana Salama, Jason Cai, Michelle Yuan, Anna Currey, Monica Sunkara, Yi Zhang, Yassine Benajiba

Abstract: Large language model (LLM) agents have evolved to intelligently process information, make decisions, and interact with users or tools. A key capability is the integration of long-term memory, enabling these agents to draw upon historical interactions and knowledge. However, the growing memory size and need for semantic structuring pose significant challenges. In this work, we propose an autonomous memory augmentation approach, MemInsight, to enhance semantic data representation and retrieval mechanisms. By leveraging autonomous augmentation of historical interactions, LLM agents are shown to deliver more accurate and contextualized responses. We empirically validate the efficacy of our proposed approach in three task scenarios: conversational recommendation, question answering, and event summarization. On the LLM-REDIAL dataset, MemInsight boosts persuasiveness of recommendations by up to 14%. Moreover, it outperforms a RAG baseline by 34% in recall for LoCoMo retrieval. Our empirical results show the potential of MemInsight to enhance the contextual performance of LLM agents across multiple tasks.

Fine-Tuned Machine Translation Metrics Struggle in Unseen Domains

Feb 28, 2024

Vilém Zouhar, Shuoyang Ding, Anna Currey, Tatyana Badeka, Jenyuan Wang, Brian Thompson

Abstract: We introduce a new, extensive multidimensional quality metrics (MQM) annotated dataset covering 11 language pairs in the biomedical domain. We use this dataset to investigate whether machine translation (MT) metrics that are fine-tuned on human-generated MT quality judgments are robust to domain shifts between training and inference. We find that fine-tuned metrics exhibit a substantial performance drop in the unseen domain scenario relative to metrics that rely on the surface form, as well as pre-trained metrics that are not fine-tuned on MT quality judgments.

RAMP: Retrieval and Attribute-Marking Enhanced Prompting for Attribute-Controlled Translation

May 26, 2023

Gabriele Sarti, Phu Mon Htut, Xing Niu, Benjamin Hsu, Anna Currey, Georgiana Dinu, Maria Nădejde

Abstract: Attribute-controlled translation (ACT) is a subtask of machine translation that involves controlling stylistic or linguistic attributes (like formality and gender) of translation outputs. While ACT has garnered attention in recent years due to its usefulness in real-world applications, progress in the task is currently limited by dataset availability, since most prior approaches rely on supervised methods. To address this limitation, we propose Retrieval and Attribute-Marking enhanced Prompting (RAMP), which leverages large multilingual language models to perform ACT in few-shot and zero-shot settings. RAMP improves generation accuracy over the standard prompting approach by (1) incorporating a semantic similarity retrieval component for selecting similar in-context examples, and (2) marking in-context examples with attribute annotations. Our comprehensive experiments show that RAMP is a viable approach in both zero-shot and few-shot settings.

* Accepted at ACL 2023
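
The two components named in the RAMP abstract (retrieving similar in-context examples, then marking them with attribute annotations) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not the authors' implementation: the word-overlap cosine stands in for a real semantic similarity model, and the `POOL` entries, the `build_prompt` helper, and the prompt template are hypothetical.

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Word-overlap cosine similarity; a toy stand-in for semantic similarity."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# Hypothetical labeled examples: source, target, attribute, and the marked span.
POOL = [
    {"src": "Are you ready?", "tgt": "Sind Sie bereit?", "attr": "formal", "marked": "Sind Sie"},
    {"src": "Are you ready?", "tgt": "Bist du bereit?", "attr": "informal", "marked": "Bist du"},
    {"src": "Thank you for your help.", "tgt": "Danke für Ihre Hilfe.", "attr": "formal", "marked": "Ihre"},
]

def build_prompt(source: str, attribute: str, pool: list, k: int = 2) -> str:
    """Pick the k most similar examples with the requested attribute and mark them."""
    candidates = [ex for ex in pool if ex["attr"] == attribute]
    ranked = sorted(candidates, key=lambda ex: cosine(source, ex["src"]), reverse=True)[:k]
    lines = []
    for ex in ranked:
        lines.append(f"English: {ex['src']}")
        # Attribute marking: name the attribute and the span that realizes it.
        lines.append(f"German ({attribute}; marked: \"{ex['marked']}\"): {ex['tgt']}")
    lines.append(f"English: {source}")
    lines.append(f"German ({attribute}):")
    return "\n".join(lines)

prompt = build_prompt("Are you sure?", "formal", POOL, k=1)
print(prompt)
```

The resulting string would be passed to a multilingual language model; the model call is deliberately left out because the abstract does not specify it.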

Pseudo-Label Training and Model Inertia in Neural Machine Translation

May 19, 2023

Benjamin Hsu, Anna Currey, Xing Niu, Maria Nădejde, Georgiana Dinu

Abstract: Like many other machine learning applications, neural machine translation (NMT) benefits from over-parameterized deep neural models. However, these models have been observed to be brittle: NMT model predictions are sensitive to small input changes and can show significant variation across re-training or incremental model updates. This work studies a frequently used method in NMT, pseudo-label training (PLT), which is common to the related techniques of forward-translation (or self-training) and sequence-level knowledge distillation. While the effect of PLT on quality is well-documented, we highlight a lesser-known effect: PLT can enhance a model's stability to model updates and input perturbations, a set of properties we call model inertia. We study inertia effects under different training settings and we identify distribution simplification as a mechanism behind the observed results.

* Accepted at ICLR 2023

MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation

Nov 02, 2022

Anna Currey, Maria Nădejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, Georgiana Dinu

Abstract: As generic machine translation (MT) quality has improved, the need for targeted benchmarks that explore fine-grained aspects of quality has increased. In particular, gender accuracy in translation can have implications in terms of output fluency, translation accuracy, and ethics. In this paper, we introduce MT-GenEval, a benchmark for evaluating gender accuracy in translation from English into eight widely spoken languages. MT-GenEval complements existing benchmarks by providing realistic, gender-balanced, counterfactual data in eight language pairs where the gender of individuals is unambiguous in the input segment, including multi-sentence segments requiring inter-sentential gender agreement. Our data and code are publicly available under a CC BY SA 3.0 license.

* Accepted at EMNLP 2022. Data and code: https://github.com/amazon-research/machine-translation-gender-eval

Sockeye 3: Fast Neural Machine Translation with PyTorch

Jul 12, 2022

Felix Hieber, Michael Denkowski, Tobias Domhan, Barbara Darques Barros, Celina Dong Ye, Xing Niu, Cuong Hoang, Ke Tran, Benjamin Hsu, Maria Nădejde (+4 more)

Abstract: Sockeye 3 is the latest version of the Sockeye toolkit for Neural Machine Translation (NMT). Now based on PyTorch, Sockeye 3 provides faster model implementations and more advanced features with a further streamlined codebase. This enables broader experimentation with faster iteration, efficient training of stronger and faster models, and the flexibility to move new ideas quickly from research to production. When running comparable models, Sockeye 3 is up to 126% faster than other PyTorch implementations on GPUs and up to 292% faster on CPUs. Sockeye 3 is open source software released under the Apache 2.0 license.

CoCoA-MT: A Dataset and Benchmark for Contrastive Controlled MT with Application to Formality

May 09, 2022

Maria Nădejde, Anna Currey, Benjamin Hsu, Xing Niu, Marcello Federico, Georgiana Dinu

Abstract: The machine translation (MT) task is typically formulated as that of returning a single translation for an input segment. However, in many cases, multiple different translations are valid and the appropriate translation may depend on the intended target audience, characteristics of the speaker, or even the relationship between speakers. Specific problems arise when dealing with honorifics, particularly translating from English into languages with formality markers. For example, the sentence "Are you sure?" can be translated in German as "Sind Sie sich sicher?" (formal register) or "Bist du dir sicher?" (informal). Using a wrong or inconsistent tone may be perceived as inappropriate or jarring for users of certain cultures and demographics. This work addresses the problem of learning to control target language attributes, in this case formality, from a small amount of labeled contrastive data. We introduce an annotated dataset (CoCoA-MT) and an associated evaluation metric for training and evaluating formality-controlled MT models for six diverse target languages. We show that we can train formality-controlled models by fine-tuning on labeled contrastive data, achieving high accuracy (82% in-domain and 73% out-of-domain) while maintaining overall quality.

* NAACL 2022

Faithful Target Attribute Prediction in Neural Machine Translation

Sep 24, 2021

Xing Niu, Georgiana Dinu, Prashant Mathur, Anna Currey

Abstract: The training data used in NMT is rarely controlled with respect to specific attributes, such as word casing or gender, which can cause errors in translations. We argue that predicting the target word and attributes simultaneously is an effective way to ensure that translations are more faithful to the training data distribution with respect to these attributes. Experimental results on two tasks, uppercased input translation and gender prediction, show that this strategy helps mirror the training data distribution in testing. It also facilitates data augmentation on the task of uppercased input translation.

* Withdrawn from Findings of ACL 2021

Improving Gender Translation Accuracy with Filtered Self-Training

Apr 15, 2021

Prafulla Kumar Choubey, Anna Currey, Prashant Mathur, Georgiana Dinu

Abstract: Targeted evaluations have found that machine translation systems often output incorrect gender, even when the gender is clear from context. Furthermore, these incorrectly gendered translations have the potential to reflect or amplify social biases. We propose a gender-filtered self-training technique to improve gender translation accuracy on unambiguously gendered inputs. This approach uses a source monolingual corpus and an initial model to generate gender-specific pseudo-parallel corpora which are then added to the training data. We filter the gender-specific corpora on the source and target sides to ensure that sentence pairs contain and correctly translate the specified gender. We evaluate our approach on translation from English into five languages, finding that our models improve gender translation accuracy without any cost to generic translation quality. In addition, we show the viability of our approach in several settings, including re-training from scratch, fine-tuning, controlling the balance of the training data, forward translation, and back-translation.
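
The filtering step described in this abstract (keep a pseudo-parallel pair only if the source contains the specified gender and the translation renders it correctly) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's pipeline: the word lists in `SRC_WORDS` and `TGT_WORDS`, the `STUB_MODEL` dictionary standing in for the initial translation model, and the `filter_pairs` helper are all hypothetical.

```python
import re

# Hypothetical gendered word lists (English source, German target).
SRC_WORDS = {"feminine": {"she", "her", "hers"}, "masculine": {"he", "him", "his"}}
TGT_WORDS = {"feminine": {"ärztin", "lehrerin"}, "masculine": {"arzt", "lehrer"}}

# Stand-in for the initial MT model; the second entry is a deliberate gender error.
STUB_MODEL = {
    "She is a doctor.": "Sie ist Ärztin.",
    "She is a teacher.": "Sie ist Lehrer.",
    "He is a teacher.": "Er ist Lehrer.",
    "The weather is nice.": "Das Wetter ist schön.",
}

def tokens(sentence: str) -> set:
    """Lowercased word tokens of a sentence."""
    return set(re.findall(r"\w+", sentence.lower()))

def filter_pairs(sources, translate, gender):
    """Keep (source, translation) pairs that contain and preserve `gender`."""
    other = "masculine" if gender == "feminine" else "feminine"
    kept = []
    for src in sources:
        src_toks = tokens(src)
        # Source-side filter: the requested gender must appear, and only that gender.
        if not src_toks & SRC_WORDS[gender] or src_toks & SRC_WORDS[other]:
            continue
        tgt = translate(src)
        tgt_toks = tokens(tgt)
        # Target-side filter: the translation must use the matching gendered form.
        if tgt_toks & TGT_WORDS[gender] and not tgt_toks & TGT_WORDS[other]:
            kept.append((src, tgt))
    return kept

feminine_pairs = filter_pairs(list(STUB_MODEL), STUB_MODEL.get, "feminine")
masculine_pairs = filter_pairs(list(STUB_MODEL), STUB_MODEL.get, "masculine")
print(feminine_pairs)
print(masculine_pairs)
```

In this sketch the mistranslated "She is a teacher." pair is dropped by the target-side check, and the ungendered sentence is dropped by the source-side check; the surviving pairs are what would be added to the training data.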

Multi-Source Syntactic Neural Machine Translation

Aug 30, 2018

Anna Currey, Kenneth Heafield

Abstract: We introduce a novel multi-source technique for incorporating source syntax into neural machine translation using linearized parses. This is achieved by employing separate encoders for the sequential and parsed versions of the same source sentence; the resulting representations are then combined using a hierarchical attention mechanism. The proposed model improves over both seq2seq and parsed baselines by over 1 BLEU on the WMT17 English-German task. Further analysis shows that our multi-source syntactic model is able to translate successfully without any parsed input, unlike standard parsed methods. In addition, performance does not deteriorate as much on long sentences as for the baselines.

* EMNLP 2018
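
The idea of combining two encoders through hierarchical attention can be sketched in a few lines: attend within each encoder's states to get one context vector per encoder, then attend over those context vectors. This is only a rough illustration; the dot-product scoring, the shared query, and the function names are assumptions made here, and the paper's actual attention parameterization may differ.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(query, states):
    """Dot-product attention: returns the weighted sum of states and the weights."""
    weights = softmax([dot(query, s) for s in states])
    dim = len(states[0])
    context = [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]
    return context, weights

def hierarchical_attention(query, seq_states, parse_states):
    """First level: one context per encoder. Second level: attend over the contexts."""
    seq_ctx, _ = attend(query, seq_states)
    parse_ctx, _ = attend(query, parse_states)
    combined, encoder_weights = attend(query, [seq_ctx, parse_ctx])
    return combined, encoder_weights

# Toy decoder query and encoder states (sequential tokens vs. linearized parse).
query = [1.0, 0.0]
seq_states = [[1.0, 0.0], [0.0, 1.0]]
parse_states = [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
combined, encoder_weights = hierarchical_attention(query, seq_states, parse_states)
print(combined, encoder_weights)
```

The second-level weights indicate how much the decoder relies on each encoder at a given step, which is one way to picture how such a model could still translate when the parsed input is uninformative.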
