Kai Han

SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs

Mar 20, 2025
Via arXiv

Mixture of Lookup Experts

Mar 20, 2025
Via arXiv

Adaptive Label Correction for Robust Medical Image Segmentation with Noisy Labels

Mar 15, 2025
Via arXiv

Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping

Mar 10, 2025
Via arXiv

DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning

Mar 09, 2025
Via arXiv

ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval

Feb 21, 2025
Via arXiv

VipDiff: Towards Coherent and Diverse Video Inpainting via Training-free Denoising Diffusion Models

Jan 21, 2025
Via arXiv

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Jan 21, 2025
Via arXiv

Eve: Efficient Multimodal Vision Language Models with Elastic Visual Experts

Jan 08, 2025
Via arXiv

PruneVid: Visual Token Pruning for Efficient Video Large Language Models

Dec 20, 2024
Via arXiv