Margaret Mitchell

Constructing the CORD-19 Vaccine Dataset

Jul 26, 2024
Via arXiv

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Jun 25, 2024
Via arXiv

CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models

May 22, 2024
Via arXiv

Evaluating the Social Impact of Generative AI Systems in Systems and Society

Jun 12, 2023
Via arXiv

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Mar 07, 2023
Via arXiv

Measuring Data

Dec 09, 2022
Via arXiv

The Stack: 3 TB of permissively licensed source code

Nov 20, 2022
Via arXiv

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022
Via arXiv

A Human Rights-Based Approach to Responsible AI

Oct 06, 2022
Via arXiv

Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements

Oct 06, 2022
Via arXiv