Mao Yang

Microsoft Research

LongRoPE2: Near-Lossless LLM Context Window Scaling

Feb 27, 2025

AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms

Feb 21, 2025

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Jan 23, 2025

LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator

Jan 18, 2025

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Jan 08, 2025

RedStone: Curating General, Code, Math, and QA Data for Large Language Models

Dec 04, 2024

SPFresh: Incremental In-Place Update for Billion-Scale Vector Search

Oct 18, 2024

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Oct 17, 2024

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Aug 12, 2024