Aitor Soroa

Conditioning LLMs to Generate Code-Switched Text: A Methodology Grounded in Naturally Occurring Data

Feb 18, 2025
Via arXiv

EuskañolDS: A Naturally Sourced Corpus for Basque-Spanish Code-Switching

Feb 05, 2025
Via arXiv

A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation

Jun 21, 2024
Via arXiv

BertaQA: How Much Do Language Models Know About Local Culture?

Jun 11, 2024
Via arXiv

XNLIeu: a dataset for cross-lingual NLI in Basque

Apr 10, 2024
Via arXiv

Latxa: An Open Language Model and Evaluation Suite for Basque

Mar 29, 2024
Via arXiv

Improving Explicit Spatial Relationships in Text-to-Image Generation through an Automatically Derived Dataset

Mar 01, 2024
Via arXiv

Do Multilingual Language Models Think Better in English?

Aug 02, 2023
Via arXiv

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Mar 07, 2023
Via arXiv

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022
Via arXiv