Mostafa Mahmoud

University of Toronto

Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training

Apr 28, 2022

Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models

Mar 23, 2022

APack: Off-Chip, Lossless Data Compression for Efficient Deep Learning Inference

Jan 21, 2022

FPRaker: A Processing Element For Accelerating Neural Network Training

Oct 15, 2020

TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

Sep 01, 2020

Laconic Deep Learning Computing

May 10, 2018

Bit-Tactical: Exploiting Ineffectual Computations in Convolutional Neural Networks: Which, Why, and How

Mar 09, 2018