Title: Principled Paraphrase Generation with Parallel Corpora

May 24, 2022

Aitor Ormazabal, Mikel Artetxe, Gorka Labaka, Aitor Soroa, Eneko Agirre

Abstract: Round-trip Machine Translation (MT) is a popular choice for paraphrase generation, which leverages readily available parallel corpora for supervision. In this paper, we formalize the implicit similarity function induced by this approach, and show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation. Based on these insights, we design an alternative similarity metric that mitigates this issue by requiring the entire translation distribution to match, and implement a relaxation of it through the Information Bottleneck method. Our approach incorporates an adversarial term into MT training in order to learn representations that encode as much information about the reference translation as possible, while keeping as little information about the input as possible. Paraphrases can be generated by decoding back to the source from this representation, without having to generate pivot translations. In addition to being more principled and efficient than round-trip MT, our approach offers an adjustable parameter to control the fidelity-diversity trade-off, and obtains better results in our experiments.

* ACL 2022
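
The weakness of round-trip MT described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's implementation: the sentence labels (`x1`, `x2`, `x3`), the pivot labels, the probabilities, and the uniform prior are all invented for illustration, and total variation distance is used only as a simple stand-in for "requiring the entire translation distribution to match", not as the metric the authors propose.

```python
# Toy translation model P(y | x). All labels and probabilities are hypothetical.
P_Y_GIVEN_X = {
    "x1": {"y_amb": 0.6, "y1": 0.4},
    "x3": {"y_amb": 0.6, "y1": 0.4},  # same distribution as x1: a true paraphrase
    "x2": {"y_amb": 0.6, "y2": 0.4},  # shares only the ambiguous pivot y_amb
}
# Uniform prior over source sentences (an assumption made for this sketch).
P_X = {x: 1.0 / len(P_Y_GIVEN_X) for x in P_Y_GIVEN_X}


def backward(y):
    """Return P(x | y) for every source sentence x, via Bayes' rule."""
    joint = {x: P_X[x] * dist.get(y, 0.0) for x, dist in P_Y_GIVEN_X.items()}
    total = sum(joint.values())
    return {x: p / total for x, p in joint.items()}


def round_trip(x_src):
    """Round-trip score: P(x_p | x_src) = sum_y P(y | x_src) * P(x_p | y)."""
    scores = {x: 0.0 for x in P_Y_GIVEN_X}
    for y, p_y in P_Y_GIVEN_X[x_src].items():
        for x_p, p_back in backward(y).items():
            scores[x_p] += p_y * p_back
    return scores


def tv_distance(x_a, x_b):
    """Total variation distance between the two translation distributions."""
    dist_a, dist_b = P_Y_GIVEN_X[x_a], P_Y_GIVEN_X[x_b]
    pivots = set(dist_a) | set(dist_b)
    return 0.5 * sum(abs(dist_a.get(y, 0.0) - dist_b.get(y, 0.0)) for y in pivots)


if __name__ == "__main__":
    for x_p, score in round_trip("x1").items():
        print(f"round-trip P({x_p} | x1) = {score:.2f}")
    print(f"TV(x1, x3) = {tv_distance('x1', 'x3'):.2f}")
    print(f"TV(x1, x2) = {tv_distance('x1', 'x2'):.2f}")
```

In this toy setup the non-paraphrase `x2` still receives round-trip probability mass from `x1` purely through the shared pivot `y_amb`, whereas comparing the full translation distributions separates `x2` from the true paraphrase `x3`.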
