Vladimir Malinovskii

Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models

Jan 31, 2025
Via arXiv

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

Nov 26, 2024
Via arXiv

PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression

May 23, 2024
Via arXiv