Title: Multilingual Machine Translation with Hyper-Adapters

May 22, 2022

Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale

Abstract: Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to transfer information, and their total number of parameters becomes prohibitively expensive as the number of languages grows. In this work, we overcome these drawbacks using hyper-adapters -- hyper-networks that generate adapters from language and layer embeddings. While past work had poor results when scaling hyper-networks, we propose a rescaling fix that significantly improves convergence and enables training larger hyper-networks. We find that hyper-adapters are more parameter efficient than regular adapters, reaching the same performance with up to 12 times fewer parameters. When using the same number of parameters and FLOPs, our approach consistently outperforms regular adapters. Also, hyper-adapters converge faster than alternative approaches and scale better than regular dense networks. Our analysis shows that hyper-adapters learn to encode language relatedness, enabling positive transfer across languages.
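To make the core idea concrete, here is a minimal NumPy sketch of a hyper-network that generates adapter weights from a language embedding and a layer embedding. It is an illustration under stated assumptions, not the authors' implementation: the sizes, the single hidden layer, the ReLU activations, the random initialization, and every name (`generate_adapter`, `apply_adapter`, `W_in`, and so on) are hypothetical. The rescaling fix mentioned in the abstract is not described there, so it is deliberately left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the paper's settings).
N_LANGS, N_LAYERS = 4, 2
D_EMB, D_HID = 8, 16            # embedding size, hyper-network hidden size
D_MODEL, D_BOTTLENECK = 32, 4   # model width, adapter bottleneck size

# Inputs to the hyper-network: one embedding per language and per layer.
lang_emb = rng.normal(size=(N_LANGS, D_EMB))
layer_emb = rng.normal(size=(N_LAYERS, D_EMB))

# Hyper-network weights, shared by all languages and layers (hypothetical design).
W_in = rng.normal(size=(2 * D_EMB, D_HID)) / np.sqrt(2 * D_EMB)
W_down_head = rng.normal(size=(D_HID, D_MODEL * D_BOTTLENECK)) / np.sqrt(D_HID)
W_up_head = rng.normal(size=(D_HID, D_BOTTLENECK * D_MODEL)) / np.sqrt(D_HID)


def generate_adapter(lang_id, layer_id):
    """Generate adapter weights for one (language, layer) pair."""
    z = np.concatenate([lang_emb[lang_id], layer_emb[layer_id]])
    h = np.maximum(z @ W_in, 0.0)  # ReLU hidden state
    w_down = (h @ W_down_head).reshape(D_MODEL, D_BOTTLENECK)
    w_up = (h @ W_up_head).reshape(D_BOTTLENECK, D_MODEL)
    return w_down, w_up


def apply_adapter(x, w_down, w_up):
    """Bottleneck adapter with a residual connection."""
    return x + np.maximum(x @ w_down, 0.0) @ w_up


def regular_adapter_params(n_langs):
    """Parameters when every (language, layer) pair stores its own adapter."""
    return n_langs * N_LAYERS * 2 * D_MODEL * D_BOTTLENECK


def hyper_adapter_params(n_langs):
    """Parameters of the embeddings plus the shared hyper-network."""
    shared = W_in.size + W_down_head.size + W_up_head.size
    return n_langs * D_EMB + N_LAYERS * D_EMB + shared


x = rng.normal(size=(3, D_MODEL))  # three token representations
w_down, w_up = generate_adapter(lang_id=1, layer_id=0)
y = apply_adapter(x, w_down, w_up)
print(y.shape)  # (3, 32)

# Cost of adding one more language under each scheme.
print(regular_adapter_params(5) - regular_adapter_params(4))  # 512
print(hyper_adapter_params(5) - hyper_adapter_params(4))      # 8
```

In this sketch, adding a language to the regular-adapter setup adds a full set of adapter matrices for every layer, while the hyper-adapter setup adds only one embedding row. Because all languages pass through the same hyper-network, related languages can in principle share what it learns. The toy sizes say nothing about absolute parameter counts or translation quality.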
