Chunlan Ma

LangSAMP: Language-Script Aware Multilingual Pretraining

Sep 26, 2024

How Transliterations Improve Crosslingual Alignment

Sep 25, 2024

Exploring the Role of Transliteration in In-Context Learning for Low-resource Languages Written in Non-Latin Scripts

Jul 02, 2024

TransMI: A Framework to Create Strong Baselines from Multilingual Pretrained Language Models for Transliterated Data

May 16, 2024

TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models

Jan 12, 2024

MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer

Jan 09, 2024

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages

May 26, 2023

Taxi1500: A Multilingual Dataset for Text Classification in 1500 Languages

May 15, 2023