Bradley McDanel

AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration
Oct 22, 2024

Accelerating Vision Transformer Training via a Patch Sampling Schedule
Aug 19, 2022

Accelerating DNN Training with Structured Data Gradient Pruning
Feb 01, 2022

FAST: DNN Training Under Variable Precision Block Floating Point with Stochastic Rounding
Oct 28, 2021

Term Revealing: Furthering Quantization at Run Time on Quantized DNNs
Jul 26, 2020

Full-stack Optimization for Accelerating CNNs with FPGA Validation
May 01, 2019

Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization
Nov 07, 2018

Incomplete Dot Products for Dynamic Computation Scaling in Neural Network Inference
Oct 21, 2017

Embedded Binarized Neural Networks
Sep 06, 2017

BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks
Sep 06, 2017