Quentin Anthony

Accelerating Large Language Model Training with Hybrid GPU-based Compression

Sep 04, 2024

Demystifying the Communication Characteristics for Distributed Transformer Models

Aug 19, 2024

Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters

Aug 09, 2024

Zyda: A 1.3T Dataset for Open Language Modeling

Jun 04, 2024

Zamba: A Compact 7B SSM Hybrid Model

May 26, 2024

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Apr 10, 2024

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Mar 26, 2024

BlackMamba: Mixture of Experts for State-Space Models

Feb 01, 2024

The Case for Co-Designing Model Architectures with Hardware

Jan 30, 2024

Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference

Jan 17, 2024