Jacob Parnell

SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual Summarization

Mar 20, 2024

Jacob Parnell, Inigo Jauregi Unanue, Massimo Piccardi

Abstract: Cross-lingual summarization (XLS) generates summaries in a language different from that of the input documents (e.g., English to Spanish), allowing speakers of the target language to gain a concise view of their content. Today, the predominant approach to this task is to take a strong pretrained multilingual language model (LM) and fine-tune it for XLS on the language pairs of interest. However, the scarcity of fine-tuning samples makes this approach challenging in some cases. For this reason, in this paper we propose revisiting the summarize-and-translate pipeline, where the summarization and translation tasks are performed in sequence. This approach allows reusing the many publicly available resources for monolingual summarization and translation, obtaining very competitive zero-shot performance. In addition, the proposed pipeline is completely differentiable end-to-end, allowing it to take advantage of few-shot fine-tuning where available. Experiments over two contemporary and widely adopted XLS datasets (CrossSum and WikiLingua) show the remarkable zero-shot performance of the proposed approach, as well as its strong few-shot performance compared to an equivalent multilingual LM baseline, which it outperforms in many languages with only 10% of the fine-tuning samples.

* Accepted to NAACL 2024

A Multi-Document Coverage Reward for RELAXed Multi-Document Summarization

Mar 06, 2022

Jacob Parnell, Inigo Jauregi Unanue, Massimo Piccardi

Abstract: Multi-document summarization (MDS) has made significant progress in recent years, in part facilitated by the availability of new, dedicated datasets and capacious language models. However, a standing limitation of these models is that they are trained against limited references and with plain maximum-likelihood objectives. As with many other generative tasks, reinforcement learning (RL) offers the potential to improve the training of MDS models; yet, it requires a carefully designed reward that can ensure appropriate leverage of both the reference summaries and the input documents. For this reason, in this paper we propose fine-tuning an MDS baseline with a reward that balances a reference-based metric such as ROUGE with coverage of the input documents. To implement the approach, we utilize RELAX (Grathwohl et al., 2018), a contemporary gradient estimator which is both low-variance and unbiased, and we fine-tune the baseline in a few-shot style for both stability and computational efficiency. Experimental results over the Multi-News and WCEP MDS datasets show significant improvements of up to +0.95 pp average ROUGE score and +3.17 pp METEOR score over the baseline, and competitive results with the literature. In addition, they show that the coverage of the input documents is increased, and evenly so across all documents.

* Accepted to ACL 2022
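
The reward described in this abstract balances a reference-based score against coverage of the input documents. The listing does not give the paper's actual formula, so the following is only a minimal sketch under stated assumptions: unigram-overlap F1 stands in for ROUGE, coverage is taken as the mean per-document word recall, and the weight `beta` and all function names are hypothetical.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1, a simple stand-in for ROUGE-1 (assumption)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def document_coverage(summary: str, documents: list[str]) -> float:
    """Mean fraction of each document's distinct words found in the summary.

    Averaging per document (rather than pooling all words) means a summary
    that ignores one document entirely is penalised for it.
    """
    summary_words = set(summary.lower().split())
    per_doc = []
    for doc in documents:
        doc_words = set(doc.lower().split())
        if doc_words:
            per_doc.append(len(doc_words & summary_words) / len(doc_words))
        else:
            per_doc.append(0.0)
    return sum(per_doc) / len(per_doc) if per_doc else 0.0


def combined_reward(summary: str, reference: str, documents: list[str],
                    beta: float = 0.5) -> float:
    """Hypothetical convex combination of the two terms; beta is illustrative."""
    return (beta * unigram_f1(summary, reference)
            + (1.0 - beta) * document_coverage(summary, documents))
```

With `beta = 1.0` the reward reduces to the reference-based term alone; lowering `beta` shifts weight toward covering every input document.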

RewardsOfSum: Exploring Reinforcement Learning Rewards for Summarisation

Jun 08, 2021

Jacob Parnell, Inigo Jauregi Unanue, Massimo Piccardi

Abstract: To date, most abstractive summarisation models have relied on variants of the negative log-likelihood (NLL) as their training objective. In some cases, reinforcement learning has been added to train the models with an objective that is closer to their evaluation measures (e.g. ROUGE). However, the reward function to be used within the reinforcement learning approach can play a key role for performance and is still partially unexplored. For this reason, in this paper, we propose two reward functions for the task of abstractive summarisation: the first function, referred to as RwB-Hinge, dynamically selects the samples for the gradient update. The second function, nicknamed RISK, leverages a small pool of strong candidates to inform the reward. In the experiments, we probe the proposed approach by fine-tuning an NLL-pretrained model over nine summarisation datasets of diverse size and nature. The experimental results show a consistent improvement over the negative log-likelihood baselines.

* 5th Workshop on Structured Prediction for NLP; held in conjunction with ACL-IJCNLP 2021
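
The abstract says only that RwB-Hinge "dynamically selects the samples for the gradient update"; it does not define the loss. One possible reading, sketched below purely as an assumption, is a REINFORCE-style loss with a baseline in which a sampled summary contributes only when its reward exceeds the baseline's reward. The function name and argument layout are hypothetical.

```python
def hinged_policy_gradient_loss(sample_log_probs: list[float],
                                sample_rewards: list[float],
                                baseline_rewards: list[float]) -> float:
    """Batch-averaged loss where only samples beating the baseline count.

    sample_log_probs: log-probability of each sampled summary under the model.
    sample_rewards:   reward (e.g. a ROUGE score) of each sampled summary.
    baseline_rewards: reward of a baseline summary for the same input
                      (assumed here, for illustration, to be a greedy decode).
    """
    if not sample_log_probs:
        return 0.0
    total = 0.0
    for log_p, r_sample, r_base in zip(sample_log_probs, sample_rewards,
                                       baseline_rewards):
        # Hinge: a non-positive advantage is clamped to zero, so that
        # sample is effectively excluded from the gradient update.
        advantage = max(0.0, r_sample - r_base)
        total += -advantage * log_p
    return total / len(sample_log_probs)
```

In an actual training loop the log-probabilities would be tensors from an autodiff framework, so that minimising this loss raises the probability of the selected samples.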

BERTTune: Fine-Tuning Neural Machine Translation with BERTScore

Jun 04, 2021

Inigo Jauregi Unanue, Jacob Parnell, Massimo Piccardi

Abstract: Neural machine translation models are often biased toward the limited translation references seen during training. To amend this form of overfitting, in this paper we propose fine-tuning the models with a novel training objective based on the recently proposed BERTScore evaluation metric. BERTScore is a scoring function based on contextual embeddings that overcomes the typical limitations of n-gram-based metrics (e.g. synonyms, paraphrases), allowing translations that are different from the references, yet close in the contextual embedding space, to be treated as substantially correct. To be able to use BERTScore as a training objective, we propose three approaches for generating soft predictions, allowing the network to remain completely differentiable end-to-end. Experiments carried out over four diverse language pairs have achieved improvements of up to 0.58 pp (3.28%) in BLEU score and up to 0.76 pp (0.98%) in BERTScore (F_BERT) when fine-tuning a strong baseline.

* Accepted at ACL 2021
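
The abstract mentions "soft predictions" that keep the network differentiable but does not say what the three approaches are. A common way to obtain a soft prediction, shown here only as an illustrative sketch and not as the paper's method, is to replace the hard argmax token with the probability-weighted average of the token embeddings. The function names and the toy embedding table are hypothetical.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_embedding(logits: list[float],
                   embedding_table: list[list[float]]) -> list[float]:
    """Expected embedding under the model's token distribution.

    Picking the argmax token is not differentiable; taking the
    probability-weighted average of all token embeddings is, so gradients
    from an embedding-based score can flow back into the logits.
    """
    probs = softmax(logits)
    dim = len(embedding_table[0])
    return [sum(p * row[d] for p, row in zip(probs, embedding_table))
            for d in range(dim)]
```

For a two-token vocabulary with one-hot embeddings, equal logits give the midpoint of the two embeddings, while a sharply peaked distribution approaches the embedding of the most likely token.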
