Yian Zhang

Language model developers should report train-test overlap

Oct 10, 2024
Via arXiv

Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models

Jul 31, 2024
Via arXiv

Nemotron-4 340B Technical Report

Jun 17, 2024
Via arXiv

Holistic Evaluation of Language Models

Nov 16, 2022
Via arXiv

When Do You Need Billions of Words of Pretraining Data?

Nov 10, 2020
Via arXiv

Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)

Oct 11, 2020
Via arXiv

Latent Tree Learning with Ordered Neurons: What Parses Does It Produce?

Oct 10, 2020
Via arXiv