Christopher Akiki

StarCoder 2 and The Stack v2: The Next Generation

Feb 29, 2024

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration

Jun 02, 2023

StarCoder: may the source be with you!

May 09, 2023

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Mar 07, 2023

Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face

Feb 28, 2023

The ROOTS Search Tool: Data Transparency for LLMs

Feb 27, 2023

SantaCoder: don't reach for the stars!

Jan 09, 2023

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022

How Train-Test Leakage Affects Zero-shot Retrieval

Jun 29, 2022

Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0

Apr 11, 2022