Title: Extrapolating Multilingual Understanding Models as Multilingual Generators

May 22, 2023

Bohong Wu, Fei Yuan, Hai Zhao, Lei Li, Jingjing Xu

Abstract: Multilingual understanding models (encoder-based models such as mBERT), pre-trained via masked language modeling, have achieved promising results on many language understanding tasks. However, these non-autoregressive (NAR) models still struggle to generate high-quality text compared with autoregressive (AR) models. Because encoder-based models offer efficient generation and self-correction abilities, this paper explores methods to equip multilingual understanding models with generation abilities, yielding a unified model. Specifically, we start from a multilingual encoder (XLM-R) and propose a Semantic-Guided Alignment-then-Denoising (SGA) approach that adapts an encoder into a multilingual generator with a small number of new parameters. Experiments show that the proposed approach is an effective adaptation method, outperforming widely used initialization-based methods with gains of 9.4 BLEU on machine translation, 8.1 ROUGE-L on question generation, and 5.5 METEOR on story generation on XLM-R$_{large}$. On the other hand, we observe that XLM-R is still inferior to mBART in supervised settings despite better results in zero-shot settings, indicating that more exploration is required to make understanding models strong generators.
