Ziwei He

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Jun 17, 2025
Via arXiv

Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache

Jun 13, 2025
Via arXiv

Pretraining Language Models to Ponder in Continuous Space

May 27, 2025
Via arXiv

WeightedKV: Attention Scores Weighted Key-Value Cache Merging for Large Language Models

Mar 03, 2025
Via arXiv

TreeKV: Smooth Key-Value Cache Compression with Tree Structures

Jan 09, 2025
Via arXiv

Towards Controlled Table-to-Text Generation with Scientific Reasoning

Dec 08, 2023
Via arXiv

Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention

Nov 13, 2023
Via arXiv

Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator

May 24, 2023
Via arXiv

Few-Shot Table-to-Text Generation with Prompt Planning and Knowledge Memorization

Feb 24, 2023
Via arXiv

Few-Shot Table-to-Text Generation with Prompt-based Adapter

Feb 24, 2023
Via arXiv