Ruslan Svirschevski

Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization

Aug 31, 2024
Via arXiv

SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

Jun 04, 2024
Via arXiv

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

Feb 29, 2024
Via arXiv

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

Jun 05, 2023
Via arXiv