Marco Cagnanzzo

RD Efficient FPGA Deployment of Learned Image Compression: Knowledge Distillation and Hybrid Quantization

Mar 05, 2025

Mazouz Alaa Eddine, Sumanta Chaudhuri, Marco Cagnanzzo, Mihai Mitrea, Enzo Tartaglione, Attilio Fiandrotti

Abstract: Learnable Image Compression (LIC) has shown the potential to outperform standardized video codecs in rate-distortion (RD) efficiency, prompting research into hardware-friendly implementations. Most existing LIC hardware implementations prioritize latency over RD efficiency and rely on an extensive exploration of the hardware design space. We present a novel design paradigm in which the burden of tuning the design for a specific hardware platform is shifted towards model dimensioning, without compromising RD efficiency. First, we design a framework for distilling a leaner student LIC model from a reference teacher: by tuning a single model hyperparameter, we can meet the constraints of different hardware platforms without a complex hardware design exploration. Second, we propose a hardware-friendly implementation of the Generalized Divisive Normalization (GDN) activation that preserves RD efficiency even after parameter quantization. Third, we design a pipelined FPGA configuration that takes full advantage of available FPGA resources by leveraging parallel processing and optimizing resource allocation. Our experiments with a state-of-the-art LIC model show that we outperform all existing FPGA implementations while performing very close to the original model in terms of RD efficiency.
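
For readers unfamiliar with GDN, the sketch below shows the standard floating-point form of the activation at a single spatial position, y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j^2). It is only a reference illustration: the hardware-friendly variant proposed in the paper is not described in the abstract and is not reproduced here, and the function name `gdn` and its argument layout are assumptions made for this example.

```python
import math


def gdn(x, beta, gamma):
    """Reference GDN for one pixel (illustrative, not the paper's variant).

    x:     list of C channel values at one spatial position
    beta:  list of C positive offsets
    gamma: C x C list of non-negative cross-channel weights
    """
    squares = [v * v for v in x]
    out = []
    for i, v in enumerate(x):
        # Normalization pool: beta_i + sum_j gamma_ij * x_j^2
        norm = beta[i] + sum(g * s for g, s in zip(gamma[i], squares))
        out.append(v / math.sqrt(norm))
    return out


# Two channels, each normalized by sqrt(2 + 1 + 1) = 2
y = gdn([1.0, 1.0], beta=[2.0, 2.0], gamma=[[1.0, 1.0], [1.0, 1.0]])
```

The square root and division per channel are the operations that tend to be costly in fixed-point hardware, which is presumably why the activation receives dedicated treatment in the work.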
