Huu Nguyen

RedPajama: an Open Dataset for Training Large Language Models

Nov 19, 2024
Via arXiv

ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming

Apr 06, 2024
Via arXiv

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Mar 30, 2024
Via arXiv

OpenAssistant Conversations -- Democratizing Large Language Model Alignment

Apr 14, 2023
Via arXiv

The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset

Mar 07, 2023
Via arXiv

SantaCoder: don't reach for the stars!

Jan 09, 2023
Via arXiv

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022
Via arXiv