Title: BERT WEAVER: Using WEight AVERaging to Enable Lifelong Learning for Transformer-based Models

Feb 21, 2022

Lisa Langnickel, Alexander Schulz, Barbara Hammer, Juliane Fluck

Abstract: Recent developments in transfer learning have advanced natural language processing tasks. Performance, however, depends on high-quality, manually annotated training data. Especially in the biomedical domain, it has been shown that one training corpus is not enough to learn generic models that are able to efficiently predict on new data. State-of-the-art models therefore need lifelong learning capabilities so that performance improves as soon as new data become available, without the need to retrain the whole model from scratch. We present WEAVER, a simple yet efficient post-processing method that infuses old knowledge into the new model, thereby reducing catastrophic forgetting. We show that applying WEAVER sequentially results in word embedding distributions similar to those obtained by combined training on all data at once, while being computationally more efficient. Because there is no need for data sharing, the presented method is also easily applicable to federated learning settings and can, for example, be beneficial for the mining of electronic health records from different clinics.
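To make the weight-averaging idea concrete, here is a minimal sketch of how an old and a newly fine-tuned model could be merged after training. The abstract does not state the exact averaging rule, so weighting each model by the number of examples it was trained on is an assumption made for illustration. The names `weave`, `n_old`, and `n_new` are hypothetical, and parameters are represented as plain lists of floats rather than real transformer tensors.

```python
from typing import Dict, List

# Illustrative stand-in for a model's parameters:
# parameter name -> flattened list of weights.
StateDict = Dict[str, List[float]]


def weave(old: StateDict, new: StateDict, n_old: int, n_new: int) -> StateDict:
    """Merge two models with identical architecture by weighted averaging.

    ASSUMPTION: each model is weighted by the number of training examples
    it has seen (n_old, n_new). The abstract does not specify this rule.
    """
    if old.keys() != new.keys():
        raise ValueError("models must share the same parameter names")
    total = n_old + n_new
    if total <= 0:
        raise ValueError("n_old + n_new must be positive")

    merged: StateDict = {}
    for name in old:
        if len(old[name]) != len(new[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise weighted mean of the old and new weights.
        merged[name] = [
            (n_old * w_old + n_new * w_new) / total
            for w_old, w_new in zip(old[name], new[name])
        ]
    return merged


# Toy example: the old model saw 300 examples, the new one 100.
old = {"layer.weight": [0.0, 2.0]}
new = {"layer.weight": [4.0, 2.0]}
merged = weave(old, new, n_old=300, n_new=100)
print(merged)  # {'layer.weight': [1.0, 2.0]}
```

Under this assumption, sequential use would call `weave` again each time a new corpus arrives, passing the previous merged model as `old` and the accumulated example count as `n_old`. Only parameters and counts are exchanged, never the training data, which is why the same step fits a federated setting.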
