Daniel Simig

D4: Improving LLM Pretraining via Document De-Duplication and Diversification

Aug 23, 2023
Via arXiv

Understanding In-Context Learning via Supportive Pretraining Data

Jun 26, 2023
Via arXiv

Evaluating end-to-end entity linking on domain-specific knowledge bases: Learning about ancient technologies from museum collections

May 23, 2023
Via arXiv

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

Dec 28, 2022
Via arXiv

Text Characterization Toolkit

Oct 04, 2022
Via arXiv

Open Vocabulary Extreme Classification Using Generative Models

May 12, 2022
Via arXiv

OPT: Open Pre-trained Transformer Language Models

May 05, 2022
Via arXiv

Few-shot Learning with Multilingual Language Models

Dec 20, 2021
Via arXiv