Xuefei Ning

Decouple-Then-Merge: Towards Better Training for Diffusion Models
Oct 09, 2024

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Oct 02, 2024

CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios
Sep 16, 2024

Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Jul 01, 2024

MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
Jun 21, 2024

Can LLMs Learn by Teaching? A Preliminary Study
Jun 20, 2024

DiTFastAttn: Attention Compression for Diffusion Transformer Models
Jun 12, 2024

ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Jun 04, 2024

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
May 30, 2024

HetHub: A Heterogeneous distributed hybrid training system for large-scale models
May 25, 2024