Magnus Sahlgren

SWEb: A Large Web Dataset for the Scandinavian Languages

Oct 06, 2024

GPT-SW3: An Autoregressive Language Model for the Nordic Languages

May 23, 2023

The Nordic Pile: A 1.2TB Nordic Dataset for Language Modeling

Mar 30, 2023

We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing

Oct 11, 2021

Cross-lingual Transfer of Monolingual Models

Sep 15, 2021

A comprehensive comparative evaluation and analysis of Distributional Semantic Models

May 20, 2021

Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?

Apr 21, 2021

Federated Word2Vec: Leveraging Federated Learning to Encourage Collaborative Representation Learning

Apr 19, 2021

The Singleton Fallacy: Why Current Critiques of Language Models Miss the Point

Feb 08, 2021

Data Readiness for Natural Language Processing

Sep 30, 2020