Yang Sui

Henry

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models

Mar 20, 2025

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Mar 20, 2025

Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization

Feb 06, 2025

Understanding Artificial Neural Network's Behavior from Neuron Activation Perspective

Dec 24, 2024

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Dec 13, 2024

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Nov 22, 2024

MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition

Nov 01, 2024

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

Jun 06, 2024

Combining Experimental and Historical Data for Policy Evaluation

Jun 01, 2024

DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models

Feb 05, 2024