Jilong Xue

GRIN: GRadient-INformed MoE

Sep 18, 2024

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor

Aug 09, 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Feb 27, 2024

Retentive Network: A Successor to Transformer for Large Language Models

Aug 09, 2023

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement

Apr 08, 2023

Dense-to-Sparse Gate for Mixture-of-Experts

Dec 29, 2021

Towards Efficient Large-Scale Graph Neural Network Computing

Oct 19, 2018

RPC Considered Harmful: Fast Distributed Deep Learning on RDMA

May 22, 2018