Zhekai Zhang

SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Nov 07, 2024
Via arXiv

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers

Oct 15, 2024
Via arXiv

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

May 07, 2024
Via arXiv

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

Jan 04, 2021
Via arXiv

Benchmark Visual Question Answer Models by using Focus Map

Jan 13, 2018
Via arXiv