Mao Yang

Microsoft Research

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Jan 23, 2025

LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator

Jan 18, 2025

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Jan 08, 2025

RedStone: Curating General, Code, Math, and QA Data for Large Language Models

Dec 04, 2024

SPFresh: Incremental In-Place Update for Billion-Scale Vector Search

Oct 18, 2024

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Oct 17, 2024

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Aug 12, 2024

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024