François Yvon

TLP

GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages

Oct 31, 2024

MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment

Oct 08, 2024

How Transliterations Improve Crosslingual Alignment

Sep 25, 2024

Zero-Shot Machine-Generated Text Detection Using Mixture of Large Language Models

Sep 11, 2024

MaskLID: Code-Switching Language Identification through Iterative Masking

Jun 10, 2024

Optimizing example selection for retrieval-augmented machine translation with translation memories

May 23, 2024

Lessons from the Trenches on Reproducible Evaluation of Language Models

May 23, 2024

CroissantLLM: A Truly Bilingual French-English Language Model

Feb 02, 2024

GlotLID: Language Identification for Low-Resource Languages

Nov 04, 2023

Structural generalization in COGS: Supertagging is (almost) all you need

Oct 21, 2023