Shiwei Liu

Learning from the Self-future: On-policy Self-distillation for dLLMs

Jun 16, 2026
Via arXiv

ShallowBench: Benchmarking Generative Drug Design Models on Shallow-Pocket Targets

Jun 04, 2026
Via arXiv

AlphaQ: Calibration-Free Bit Allocation for Mixture-of-Experts Quantization

Jun 03, 2026
Via arXiv

One LR Doesn't Fit All: Heavy-Tail Guided Layerwise Learning Rates for LLMs

May 21, 2026
Via arXiv

Layer Collapse in Diffusion Language Models

May 07, 2026
Via arXiv

ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

May 05, 2026
Via arXiv

Motion-Aware Caching for Efficient Autoregressive Video Generation

May 03, 2026
Via arXiv

When Does Sparsity Mitigate the Curse of Depth in LLMs

Mar 16, 2026
Via arXiv

Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?

Feb 26, 2026
Via arXiv

Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models

Feb 11, 2026
Via arXiv