Title: Danoliteracy of Generative, Large Language Models

Oct 30, 2024

Søren Vejlgaard Holm, Lars Kai Hansen, Martin Carsten Nielsen

Abstract: The language technology moonshot moment of Generative, Large Language Models (GLLMs) was not limited to English: these models brought a surge of technological applications, investments, and hype to low-resource languages as well. However, the capabilities of these models in languages such as Danish were until recently difficult to verify beyond qualitative demonstrations, due to a lack of applicable evaluation corpora. We present a GLLM benchmark to evaluate Danoliteracy, a measure of Danish language and cultural competency, across eight diverse scenarios such as Danish citizenship tests and abstractive social media question answering. This limited-size benchmark is found to produce a robust ranking that correlates with human feedback at $\rho \sim 0.8$, with GPT-4 and Claude Opus models achieving the highest rankings. Analyzing these model results across scenarios, we find one strong underlying factor explaining $95\%$ of scenario performance variance for GLLMs in Danish, suggesting a $g$ factor of model consistency in language adaption.
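The two quantitative claims in the abstract (a rank correlation with human feedback and a single factor explaining most of the variance across scenarios) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the choice of Spearman correlation and PCA via SVD, and the synthetic score matrix are all assumptions made for illustration, and the numbers it prints say nothing about the paper's results.

```python
import numpy as np


def first_factor_share(scores: np.ndarray) -> float:
    """Share of total variance captured by the first principal component.

    `scores` is a hypothetical (n_models, n_scenarios) matrix of benchmark results.
    """
    centered = scores - scores.mean(axis=0)
    # Squared singular values of the centered matrix are proportional
    # to the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return float(variances[0] / variances.sum())


def spearman_rho(a, b) -> float:
    """Spearman rank correlation; this simple version assumes no tied values."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])


if __name__ == "__main__":
    # Synthetic data only (assumption): 12 models, 8 scenarios, one latent
    # "ability" per model plus a small amount of scenario-specific noise.
    rng = np.random.default_rng(0)
    ability = rng.normal(size=12)
    loadings = rng.uniform(0.5, 1.5, size=8)
    scores = np.outer(ability, loadings) + 0.1 * rng.normal(size=(12, 8))

    # Hypothetical human preference scores, noisily related to ability.
    human = ability + 0.5 * rng.normal(size=12)

    print("first-factor variance share:", first_factor_share(scores))
    print("Spearman rho vs. human scores:", spearman_rho(scores.mean(axis=1), human))
```

A first-factor share close to 1 means that model performance on any one scenario is largely predictable from performance on the others, which is the sense in which the abstract speaks of a $g$ factor.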

* 16 pages, 13 figures, submitted to: NoDaLiDa/Baltic-HLT 2025
