Benoît Sagot

ALMAnaCH

Diachronic Document Dataset for Semantic Layout Analysis

Nov 15, 2024

CamemBERT 2.0: A Smarter French Language Model Aged to Perfection

Nov 13, 2024

Tree of Problems: Improving structured problem solving with compositionality

Oct 09, 2024

Molyé: A Corpus-based Approach to Language Contact in Colonial France

Aug 08, 2024

In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation

Aug 01, 2024

Towards Zero-Shot Multimodal Machine Translation

Jul 18, 2024

mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus

Jun 13, 2024

PatentEval: Understanding Errors in Patent Generation

Jun 05, 2024

Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck

Apr 11, 2024

Making Sentence Embeddings Robust to User-Generated Content

Mar 25, 2024