Wenyuan Yu

Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference

Jan 27, 2025
Via arXiv

Qwen2.5-1M Technical Report

Jan 26, 2025
Via arXiv

Exact Acceleration of Subgraph Graph Neural Networks by Eliminating Computation Redundancy

Dec 24, 2024
Via arXiv

AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations

Oct 17, 2024
Via arXiv

Unicron: Economizing Self-Healing LLM Training at Scale

Dec 30, 2023
Via arXiv

LON-GNN: Spectral GNNs with Learnable Orthonormal Basis

Mar 30, 2023
Via arXiv