Ramin Ghaznavi-Youvalari

Image coding for machines: an end-to-end learned approach

Aug 30, 2021

Nam Le, Honglei Zhang, Francesco Cricri, Ramin Ghaznavi-Youvalari, Esa Rahtu

Abstract: Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images. Given the dramatic explosion in the number of images generated per day, a question arises: how much better would an image codec targeting machine-consumption perform against state-of-the-art codecs targeting human-consumption? In this paper, we propose an image codec for machines which is neural network (NN) based and end-to-end learned. In particular, we propose a set of training strategies that address the delicate problem of balancing competing loss functions, such as computer vision task losses, image distortion losses, and rate loss. Our experimental results show that our NN-based codec outperforms the state-of-the-art Versatile Video Coding (VVC) standard on the object detection and instance segmentation tasks, achieving -37.87% and -32.90% of BD-rate gain, respectively, while being fast thanks to its compact size. To the best of our knowledge, this is the first end-to-end learned machine-targeted image codec.

* 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2021), 2021, pp. 1590-1594
* Fixed a couple of mistakes since the version accepted in IEEE ICASSP2021
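
The abstract above mentions balancing competing losses (task, distortion, and rate). As a minimal sketch of what such balancing can look like, assuming a plain weighted sum with a weight that changes over training: the function names, the linear ramp, and all numeric values below are hypothetical illustrations, not the training strategies proposed in the paper.

```python
def combined_loss(task_loss, distortion_loss, rate_loss, w_task, w_dist, w_rate):
    """Weighted sum of the three competing objectives (assumed form)."""
    return w_task * task_loss + w_dist * distortion_loss + w_rate * rate_loss


def linear_ramp(step, start_step, end_step, start_value, end_value):
    """Hypothetical weight schedule: hold, interpolate linearly, then hold."""
    if step <= start_step:
        return start_value
    if step >= end_step:
        return end_value
    frac = (step - start_step) / (end_step - start_step)
    return start_value + frac * (end_value - start_value)


# Example: the rate weight is ramped in gradually so that early training
# is dominated by the task and distortion terms (illustrative values only).
for step in (0, 50, 100):
    w_rate = linear_ramp(step, 0, 100, 0.0, 1.0)
    total = combined_loss(2.0, 1.0, 0.5, w_task=1.0, w_dist=0.5, w_rate=w_rate)
    print(step, w_rate, total)
```

A schedule like this is only one possible way to trade the terms off; the actual weighting used by the authors is described in the paper itself.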

Learned Image Coding for Machines: A Content-Adaptive Approach

Aug 23, 2021

Nam Le, Honglei Zhang, Francesco Cricri, Ramin Ghaznavi-Youvalari, Hamed Rezazadegan Tavakoli, Esa Rahtu

Abstract: Today, according to the Cisco Annual Internet Report (2018-2023), the fastest-growing category of Internet traffic is machine-to-machine communication. In particular, machine-to-machine communication of images and videos represents a new challenge and opens up new perspectives in the context of data compression. One possible solution approach consists of adapting current human-targeted image and video coding standards to the use case of machine consumption. Another approach consists of developing completely new compression paradigms and architectures for machine-to-machine communications. In this paper, we focus on image compression and present an inference-time content-adaptive finetuning scheme that optimizes the latent representation of an end-to-end learned image codec, aimed at improving the compression efficiency for machine-consumption. The conducted experiments show that our online finetuning brings an average bitrate saving (BD-rate) of -3.66% with respect to our pretrained image codec. In particular, at low bitrate points, our proposed method results in a significant bitrate saving of -9.85%. Overall, our pretrained-and-then-finetuned system achieves -30.54% BD-rate over the state-of-the-art image/video codec Versatile Video Coding (VVC).

* 2021 IEEE International Conference on Multimedia and Expo (ICME), 2021, pp. 1-6
* Added some typo fixes since the accepted version in ICME2021
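
The inference-time finetuning described in the abstract above optimizes the latent representation of a single image while the codec itself stays fixed. The toy sketch below illustrates that general idea only. It assumes a frozen scalar-gain "decoder", a squared-error distortion, and an L2 penalty on the latent as a stand-in for the rate term; all of these (and every name and value) are hypothetical simplifications, not the codec or losses used in the paper.

```python
DECODER_GAIN = 2.0   # frozen "decoder": x_hat_i = DECODER_GAIN * y_i (toy assumption)
RATE_WEIGHT = 0.1    # weight of the toy rate proxy (hypothetical value)


def loss(latent, target):
    """Distortion plus a rate proxy, both evaluated on the latent."""
    distortion = sum((DECODER_GAIN * y - x) ** 2 for y, x in zip(latent, target))
    rate_proxy = sum(y ** 2 for y in latent)
    return distortion + RATE_WEIGHT * rate_proxy


def finetune_latent(latent, target, lr=0.05, steps=100):
    """Gradient descent on the latent only; the decoder is never updated."""
    latent = list(latent)
    for _ in range(steps):
        for i, (y, x) in enumerate(zip(latent, target)):
            grad = 2 * DECODER_GAIN * (DECODER_GAIN * y - x) + 2 * RATE_WEIGHT * y
            latent[i] = y - lr * grad
    return latent


image = [1.0, -0.5, 0.25]
# Stand-in for the encoder output: a latent that reconstructs the image exactly.
initial_latent = [x / DECODER_GAIN for x in image]
tuned_latent = finetune_latent(initial_latent, image)
print(loss(initial_latent, image), loss(tuned_latent, image))
```

Because only the latent changes, nothing extra has to be sent to the receiver beyond the (re-optimized) latent itself, which is what makes this kind of per-image adaptation attractive at inference time.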
