Amir Hossein Kargaran

GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages

Oct 31, 2024

MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment

Oct 08, 2024

How Transliterations Improve Crosslingual Alignment

Sep 25, 2024

MaskLID: Code-Switching Language Identification through Iterative Masking

Jun 10, 2024

GIRT-Model: Automated Generation of Issue Report Templates

Feb 08, 2024

GlotLID: Language Identification for Low-Resource Languages

Nov 04, 2023

GlotScript: A Resource and Tool for Low Resource Writing System Identification

Sep 23, 2023

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages

May 26, 2023

MenuCraft: Interactive Menu System Design with Large Language Models

Mar 08, 2023

On Detecting Hidden Third-Party Web Trackers with a Wide Dependency Chain Graph: A Representation Learning Approach

Apr 29, 2020