Sheng-Chun Kao

NonGEMM Bench: Understanding the Performance Horizon of the Latest ML Workloads with NonGEMM Workloads

Apr 17, 2024

Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers

Feb 07, 2024

JaxPruner: A concise library for sparsity research

May 02, 2023

Demystifying Map Space Exploration for NPUs

Oct 07, 2022

Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

Sep 15, 2022

DiGamma: Domain-aware Genetic Algorithm for HW-Mapping Co-optimization for DNN Accelerators

Jan 26, 2022

DNNFuser: Generative Pre-Trained Transformer as a Generalized Mapper for Layer Fusion in DNN Accelerators

Jan 26, 2022

ATTACC the Quadratic Bottleneck of Attention Layers

Jul 13, 2021

Domain-specific Genetic Algorithm for Multi-tenant DNN Accelerator Scheduling

Apr 30, 2021

ConfuciuX: Autonomous Hardware Resource Assignment for DNN Accelerators using Reinforcement Learning

Sep 04, 2020