Xuyang Shen

MiniMax-01: Scaling Foundation Models with Lightning Attention

Jan 14, 2025
Via arXiv

Scaling Laws for Linear Complexity Language Models

Jun 24, 2024

You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet

May 31, 2024

Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention

May 27, 2024

Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective

May 27, 2024

TAVGBench: Benchmarking Text to Audible-Video Generation

Apr 22, 2024

HGRN2: Gated Linear RNNs with State Expansion

Apr 11, 2024

Linear Attention Sequence Parallelism

Apr 03, 2024

CO2: Efficient Distributed Training with Full Communication-Computation Overlap

Jan 29, 2024

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Jan 15, 2024