Chengquan Jiang

FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion

Jun 12, 2024
Via arXiv

ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs

Oct 06, 2022
Via arXiv