Hila Gonen

Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models

Aug 12, 2024
Via arXiv

MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization

Jul 11, 2024
Via arXiv

Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects

Jun 27, 2024
Via arXiv

MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling

Mar 15, 2024
Via arXiv

Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models

Jan 19, 2024
Via arXiv

Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark

Nov 15, 2023
Via arXiv

That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?

Oct 23, 2023
Via arXiv

BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer

May 24, 2023
Via arXiv

Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models

May 23, 2023
Via arXiv

Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation

Feb 15, 2023
Via arXiv