Yuning Mao

Jack

Improving Model Factuality with Fine-grained Critique-based Evaluator

Oct 24, 2024
Via arXiv

The Llama 3 Herd of Models

Jul 31, 2024
Via arXiv

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Feb 26, 2024
Via arXiv

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Dec 07, 2023
Via arXiv

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

Dec 07, 2023
Via arXiv

MART: Improving LLM Safety with Multi-round Automatic Red-Teaming

Nov 13, 2023
Via arXiv

Llama 2: Open Foundation and Fine-Tuned Chat Models

Jul 19, 2023
Via arXiv

LIMA: Less Is More for Alignment

May 18, 2023
Via arXiv

Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization

May 06, 2023
Via arXiv

Representation Deficiency in Masked Language Modeling

Feb 04, 2023
Via arXiv