Fajri Koto

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Mar 10, 2025
via arXiv

Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh

Mar 03, 2025
via arXiv

Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension

Feb 20, 2025
via arXiv

Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh

Feb 19, 2025
via arXiv

Qorgau: Evaluating LLM Safety in Kazakh-Russian Bilingual Contexts

Feb 19, 2025
via arXiv

Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Feb 18, 2025
via arXiv

Commonsense Reasoning in Arab Culture

Feb 18, 2025
via arXiv

Synthetic Data Generation for Culturally Nuanced Commonsense Reasoning in Low-Resource Languages

Feb 18, 2025
via arXiv

KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan

Feb 18, 2025
via arXiv

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch

Jan 16, 2025
via arXiv