Olatunji Ruwase

Mojito: Motion Trajectory and Intensity Control for Video Generation

Dec 12, 2024
Via arXiv

Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer

Aug 30, 2024
Via arXiv

Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training

Jun 27, 2024
Via arXiv

FastPersist: Accelerating Model Checkpointing in Deep Learning

Jun 19, 2024
Via arXiv

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024
Via arXiv

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

Mar 05, 2024
Via arXiv

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Jan 25, 2024
Via arXiv

ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Dec 18, 2023
Via arXiv

DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention

Sep 29, 2023
Via arXiv

DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales

Aug 02, 2023
Via arXiv