Multilingual pre-trained language models perform remarkably well on cross-lingual transfer for downstream tasks. Despite this impressive performance, their language neutrality (i.e., the extent to which they use shared representations to encode similar phenomena across languages) and its role in achieving such performance remain open questions. In this work, we conceptualize the language neutrality of multilingual models as a function of the overlap between the language-encoding sub-networks of these models. Using mBERT as a foundation, we employ the lottery ticket hypothesis to discover sub-networks that are individually optimized for various languages and tasks. Evaluating on three distinct tasks and eleven typologically diverse languages, we show that the sub-networks found for different languages are in fact quite similar, supporting the idea that mBERT jointly encodes multiple languages in shared parameters. We conclude that mBERT is composed of a language-neutral sub-network shared among many languages, along with multiple ancillary language-specific sub-networks, with the former playing the more prominent role in mBERT's impressive cross-lingual performance.
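As a rough illustration of the overlap measure described above (the paper's exact formulation and pruning procedure may differ), the sketch below derives a binary keep-mask for each language via magnitude pruning, in the spirit of lottery-ticket-style sub-network discovery, and then scores the similarity of two masks with a Jaccard overlap. All function names, shapes, and sparsity values here are hypothetical and purely illustrative.

```python
import numpy as np


def magnitude_prune_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a binary mask keeping the largest-magnitude weights.

    A hypothetical stand-in for lottery-ticket-style pruning: retain the
    top (1 - sparsity) fraction of weights by absolute value.
    """
    flat = np.abs(weights).ravel()
    keep = int(round((1.0 - sparsity) * flat.size))
    if keep == 0:
        return np.zeros(weights.shape, dtype=bool)
    # k-th largest absolute value serves as the keep threshold
    threshold = np.partition(flat, flat.size - keep)[flat.size - keep]
    return np.abs(weights) >= threshold


def mask_overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Jaccard similarity of two sub-network masks:
    |A intersect B| / |A union B| over the retained parameters."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection) / float(union) if union else 1.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy weight matrices standing in for weights fine-tuned on two languages.
    weights_lang1 = rng.normal(size=(768, 768))
    weights_lang2 = weights_lang1 + 0.1 * rng.normal(size=(768, 768))

    mask1 = magnitude_prune_mask(weights_lang1, sparsity=0.7)
    mask2 = magnitude_prune_mask(weights_lang2, sparsity=0.7)
    print(f"Sub-network overlap (Jaccard): {mask_overlap(mask1, mask2):.3f}")
```

A high overlap score under this kind of measure would indicate that two languages rely on largely the same retained parameters, which is the intuition behind treating a shared sub-network as language-neutral.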