Tim Dettmers

OLMoE: Open Mixture-of-Experts Language Models

Sep 03, 2024

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Dec 13, 2023

MatFormer: Nested Transformer for Elastic Inference

Oct 11, 2023

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

Jun 05, 2023

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model

May 23, 2023

QLoRA: Efficient Finetuning of Quantized LLMs

May 23, 2023

Stable and low-precision training for large-scale vision-language models

Apr 25, 2023

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

Jan 27, 2023

The case for 4-bit precision: k-bit Inference Scaling Laws

Dec 19, 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022