Mohammad Shoeybi

Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset

Dec 03, 2024
Via arXiv

MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs

Nov 04, 2024
Via arXiv

MIND: Math Informed syNthetic Dialogues for Pretraining LLMs

Oct 15, 2024
Via arXiv

Upcycling Large Language Models into Mixture of Experts

Oct 10, 2024
Via arXiv

NVLM: Open Frontier-Class Multimodal LLMs

Sep 17, 2024
Via arXiv

LLM Pruning and Distillation in Practice: The Minitron Approach

Aug 21, 2024
Via arXiv

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Jul 19, 2024
Via arXiv

Compact Language Models via Pruning and Knowledge Distillation

Jul 19, 2024
Via arXiv

Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models

Jul 09, 2024
Via arXiv

Data, Data Everywhere: A Guide for Pretraining Dataset Construction

Jul 08, 2024
Via arXiv