Lu Yin

A Survey of Weight Space Learning: Understanding, Representation, and Generation

Mar 10, 2026

Progressive Residual Warmup for Language Model Pretraining

Mar 05, 2026

Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?

Feb 26, 2026

Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models

Feb 11, 2026

Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization

Feb 10, 2026

SemPA: Improving Sentence Embeddings of Large Language Models through Semantic Preference Alignment

Jan 08, 2026

Demystifying the Roles of LLM Layers in Retrieval, Knowledge, and Reasoning

Oct 02, 2025

Diffusion Language Models Know the Answer Before Decoding

Aug 27, 2025

AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs

Jun 17, 2025

AgentAlign: Navigating Safety Alignment in the Shift from Informative to Agentic Large Language Models

May 29, 2025