Shiwei Liu

Layer Collapse in Diffusion Language Models

May 07, 2026

ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

May 05, 2026

Motion-Aware Caching for Efficient Autoregressive Video Generation

May 03, 2026

When Does Sparsity Mitigate the Curse of Depth in LLMs

Mar 16, 2026

Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?

Feb 26, 2026

Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models

Feb 11, 2026

Demystifying the Roles of LLM Layers in Retrieval, Knowledge, and Reasoning

Oct 02, 2025

Diffusion Language Models Know the Answer Before Decoding

Aug 27, 2025

AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs

Jun 17, 2025

A Technical Study into Small Reasoning Language Models

Jun 16, 2025