Minghao Yan

Decoding Speculative Decoding

Feb 02, 2024
Via arXiv

PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices

Oct 30, 2023
Via arXiv

Distributed SLIDE: Enabling Training Large Neural Networks on Low Bandwidth and Simple CPU-Clusters via Model Parallelism and Sparsity

Jan 29, 2022
Via arXiv

PairConnect: A Compute-Efficient MLP Alternative to Attention

Jun 15, 2021
Via arXiv