
Lingxiao Ma

WaferLLM: A Wafer-Scale LLM Inference System

Feb 06, 2025

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor

Aug 09, 2024

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Feb 27, 2024

BitNet: Scaling 1-bit Transformers for Large Language Models

Oct 17, 2023

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement

Apr 08, 2023

SparDA: Accelerating Dynamic Sparse Deep Neural Networks via Sparse-Dense Transformation

Jan 26, 2023

Dense-to-Sparse Gate for Mixture-of-Experts

Dec 29, 2021

Architectural Implications of Graph Neural Networks

Sep 02, 2020