Marta R. Costa-jussà

NLLB Team

LCFO: Long Context and Long Form Output Dataset and Benchmarking

Dec 12, 2024
Via arXiv

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset

Dec 11, 2024
Via arXiv

Large Concept Models: Language Modeling in a Sentence Representation Space

Dec 11, 2024
Via arXiv

Y-NQ: English-Yorùbá Evaluation dataset for Open-Book Reading Comprehension and Text Generation

Dec 11, 2024
Via arXiv

On the Role of Speech Data in Reducing Toxicity Detection Bias

Nov 12, 2024
Via arXiv

On the Similarity of Circuits across Languages: a Case Study on the Subject-verb Agreement Task

Oct 09, 2024
Via arXiv

Unveiling the Role of Pretraining in Direct Speech Translation

Sep 26, 2024
Via arXiv

Linguini: A benchmark for language-agnostic linguistic reasoning

Sep 18, 2024
Via arXiv

Towards Massive Multilingual Holistic Bias

Jun 29, 2024
Via arXiv

A Primer on the Inner Workings of Transformer-based Language Models

May 02, 2024
Via arXiv