
Yang Sui


LowDiff: Efficient Diffusion Sampling with Low-Resolution Condition

Sep 18, 2025

When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios

Jul 27, 2025

Multi-task Learning for Heterogeneous Multi-source Block-Wise Missing Data

May 30, 2025

Multi-task Learning for Heterogeneous Data via Integrating Shared and Task-Specific Encodings

May 30, 2025

HoliTom: Holistic Token Merging for Fast Video Large Language Models

May 28, 2025

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

May 28, 2025

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

Apr 15, 2025

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models

Mar 20, 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Mar 20, 2025

Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization

Feb 06, 2025