Xuyang Shen

Scaling Laws for Linear Complexity Language Models

Jun 24, 2024
Via arXiv

You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet

May 31, 2024
Via arXiv

Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective

May 27, 2024
Via arXiv

Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention

May 27, 2024
Via arXiv

TAVGBench: Benchmarking Text to Audible-Video Generation

Apr 22, 2024
Via arXiv

HGRN2: Gated Linear RNNs with State Expansion

Apr 11, 2024
Via arXiv

Linear Attention Sequence Parallelism

Apr 03, 2024
Via arXiv

CO2: Efficient Distributed Training with Full Communication-Computation Overlap

Jan 29, 2024
Via arXiv

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Jan 15, 2024
Via arXiv

Scaling TransNormer to 175 Billion Parameters

Jul 27, 2023
Via arXiv