François Yvon

Lessons from the Trenches on Reproducible Evaluation of Language Models

May 23, 2024

Optimizing example selection for retrieval-augmented machine translation with translation memories

May 23, 2024

CroissantLLM: A Truly Bilingual French-English Language Model

Feb 02, 2024

GlotLID: Language Identification for Low-Resource Languages

Nov 04, 2023

Structural generalization in COGS: Supertagging is (almost) all you need

Oct 21, 2023

Towards Example-Based NMT with Multi-Levenshtein Transformers

Oct 13, 2023

GlotScript: A Resource and Tool for Low Resource Writing System Identification

Sep 23, 2023

BiSync: A Bilingual Editor for Synchronized Monolingual Texts

Jun 01, 2023

Assessing Word Importance Using Models Trained for Semantic Tasks

May 31, 2023

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages

May 26, 2023