Qiyang Min

Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts

Mar 20, 2025
Via arXiv

Frac-Connections: Fractional Extension of Hyper-Connections

Mar 18, 2025
Via arXiv

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Jan 28, 2025
Via arXiv

Ultra-Sparse Memory Network

Nov 19, 2024
Via arXiv

Hyper-Connections

Sep 29, 2024
Via arXiv