Panda

Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference

Jan 17, 2024
Via arXiv

Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference

May 24, 2023
Via arXiv

Performance Characterization of using Quantization for DNN Inference on Edge Devices: Extended Version

Mar 09, 2023
Via arXiv

HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow

Nov 12, 2019
Via arXiv