Lingxiao Ma

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor

Aug 09, 2024

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Feb 27, 2024

BitNet: Scaling 1-bit Transformers for Large Language Models

Oct 17, 2023

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement

Apr 08, 2023

SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation

Jan 26, 2023

Dense-to-Sparse Gate for Mixture-of-Experts

Dec 29, 2021

Architectural Implications of Graph Neural Networks

Sep 02, 2020

Towards Efficient Large-Scale Graph Neural Network Computing

Oct 19, 2018