Abstract:Neural image compression, based on auto-encoders and overfitted representations, relies on a latent representation of the coded signal. This representation needs to be compact and uses low resolution feature maps. In the decoding process, those latents are upsampled and filtered using stacks of convolution filters and non linear elements to recover the decoded image. Therefore, the upsampling process is crucial in the design of a neural coding scheme and is of particular importance for overfitted codecs where the network parameters, including the upsampling filters, are part of the representation. This paper addresses the improvement of the upsampling process in order to reduce its complexity and limit the number of parameters. A new upsampling structure is presented whose improvements are illustrated within the Cool-Chic overfitted image coding framework. The proposed approach offers a rate reduction of 4.7%. The code is provided.
Abstract:We propose a lightweight learned video codec with 900 multiplications per decoded pixel and 800 parameters overall. To the best of our knowledge, this is one of the neural video codecs with the lowest decoding complexity. It is built upon the overfitted image codec Cool-chic and supplements it with an inter coding module to leverage the video's temporal redundancies. The proposed model is able to compress videos using both low-delay and random access configurations and achieves rate-distortion close to AVC while out-performing other overfitted codecs such as FFNeRV. The system is made open-source: orange-opensource.github.io/Cool-Chic.
Abstract:This paper summarises the design of the Cool-Chic candidate for the Challenge on Learned Image Compression. This candidate attempts to demonstrate that neural coding methods can lead to low complexity and lightweight image decoders while still offering competitive performance. The approach is based on the already published overfitted lightweight neural networks Cool-Chic, further adapted to the human subjective viewing targeted in this challenge.
Abstract:This paper summarises the design of the candidate ED for the Challenge on Learned Image Compression 2024. This candidate aims at providing an anchor based on conventional coding technologies to the learning-based approaches mostly targeted in the challenge. The proposed candidate is based on the Enhanced Compression Model (ECM) developed at JVET, the Joint Video Experts Team of ITU-T VCEG and ISO/IEC MPEG. Here, ECM is adapted to the challenge objective: to maximise the perceived quality, the encoding is performed according to a perceptual metric, also the sequence selection is performed in a perceptual manner to fit the target bit per pixel objectives. The primary objective of this candidate is to assess the recent developments in video coding standardisation and in parallel to evaluate the progress made by learning-based techniques. To this end, this paper explains how to generate coded images fulfilling the challenge requirements, in a reproducible way, targeting the maximum performance.
Abstract:We propose a neural image codec at reduced complexity which overfits the decoder parameters to each input image. While autoencoders perform up to a million multiplications per decoded pixel, the proposed approach only requires 2300 multiplications per pixel. Albeit low-complexity, the method rivals autoencoder performance and surpasses HEVC performance under various coding conditions. Additional lightweight modules and an improved training process provide a 14% rate reduction with respect to previous overfitted codecs, while offering a similar complexity. This work is made open-source at https://orange-opensource.github.io/Cool-Chic/