
Ali Hadi Zadeh

Schrödinger's FP: Dynamic Adaptation of Floating-Point Containers for Deep Learning Training

Apr 28, 2022

Mokey: Enabling Narrow Fixed-Point Inference for Out-of-the-Box Floating-Point Transformer Models

Mar 23, 2022

FPRaker: A Processing Element For Accelerating Neural Network Training

Oct 15, 2020

TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

Sep 01, 2020

GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference

May 08, 2020