Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Pierre Hellier

D-Garment: Physics-Conditioned Latent Diffusion for Dynamic Garment Deformations

Apr 04, 2025

Antoine Dumoulin, Adnane Boukhayma, Laurence Boissieux, Bharath Bhushan Damodaran, Pierre Hellier, Stefanie Wuhrer

Abstract:Adjusting and deforming 3D garments to body shapes, body motion, and cloth material is an important problem in virtual and augmented reality. Applications are numerous, ranging from virtual change rooms to the entertainment and gaming industry. This problem is challenging as garment dynamics influence geometric details such as wrinkling patterns, which depend on physical input including the wearer's body shape and motion, as well as cloth material features. Existing work studies learning-based modeling techniques to generate garment deformations from example data, and physics-inspired simulators to generate realistic garment dynamics. We propose here a learning-based approach trained on data generated with a physics-based simulator. Compared to prior work, our 3D generative model learns garment deformations for loose cloth geometry, especially for large deformations and dynamic wrinkles driven by body motion and cloth material. Furthermore, the model can be efficiently fitted to observations captured using vision sensors. We propose to leverage the capability of diffusion models to learn fine-scale detail: we model the 3D garment in a 2D parameter space, and learn a latent diffusion model using this representation independent from the mesh resolution. This allows to condition global and local geometric information with body and material information. We quantitatively and qualitatively evaluate our method on both simulated data and data captured with a multi-view acquisition platform. Compared to strong baselines, our method is more accurate in terms of Chamfer distance.

* 11 pages, 7 figures

Via

Access Paper or Ask Questions

Exploiting Latent Properties to Optimize Neural Codecs

Jan 02, 2025

Muhammet Balcilar, Bharath Bhushan Damodaran, Karam Naser, Franck Galpin, Pierre Hellier

Figure 1 for Exploiting Latent Properties to Optimize Neural Codecs

Figure 2 for Exploiting Latent Properties to Optimize Neural Codecs

Figure 3 for Exploiting Latent Properties to Optimize Neural Codecs

Figure 4 for Exploiting Latent Properties to Optimize Neural Codecs

Abstract:End-to-end image and video codecs are becoming increasingly competitive, compared to traditional compression techniques that have been developed through decades of manual engineering efforts. These trainable codecs have many advantages over traditional techniques, such as their straightforward adaptation to perceptual distortion metrics and high performance in specific fields thanks to their learning ability. However, current state-of-the-art neural codecs do not fully exploit the benefits of vector quantization and the existence of the entropy gradient in decoding devices. In this paper, we propose to leverage these two properties (vector quantization and entropy gradient) to improve the performance of off-the-shelf codecs. Firstly, we demonstrate that using non-uniform scalar quantization cannot improve performance over uniform quantization. We thus suggest using predefined optimal uniform vector quantization to improve performance. Secondly, we show that the entropy gradient, available at the decoder, is correlated with the reconstruction error gradient, which is not available at the decoder. We therefore use the former as a proxy to enhance compression performance. Our experimental results show that these approaches save between 1 to 3% of the rate for the same quality across various pretrained methods. In addition, the entropy gradient based solution improves traditional codec performance significantly as well.

* Accepted in IEEE TRANSACTIONS ON IMAGE PROCESSING

Via

Access Paper or Ask Questions

Improved Positional Encoding for Implicit Neural Representation based Compact Data Representation

Nov 10, 2023

Bharath Bhushan Damodaran, Francois Schnitzler, Anne Lambert, Pierre Hellier

Abstract:Positional encodings are employed to capture the high frequency information of the encoded signals in implicit neural representation (INR). In this paper, we propose a novel positional encoding method which improves the reconstruction quality of the INR. The proposed embedding method is more advantageous for the compact data representation because it has a greater number of frequency basis than the existing methods. Our experiments shows that the proposed method achieves significant gain in the rate-distortion performance without introducing any additional complexity in the compression task and higher reconstruction quality in novel view synthesis.

* Published at ICCV 2023 Workshop on Neural Fields for Autonomous Driving and Robotics

Via

Access Paper or Ask Questions

Latent-Shift: Gradient of Entropy Helps Neural Codecs

Aug 01, 2023

Muhammet Balcilar, Bharath Bhushan Damodaran, Karam Naser, Franck Galpin, Pierre Hellier

Figure 1 for Latent-Shift: Gradient of Entropy Helps Neural Codecs

Figure 2 for Latent-Shift: Gradient of Entropy Helps Neural Codecs

Figure 3 for Latent-Shift: Gradient of Entropy Helps Neural Codecs

Abstract:End-to-end image/video codecs are getting competitive compared to traditional compression techniques that have been developed through decades of manual engineering efforts. These trainable codecs have many advantages over traditional techniques such as easy adaptation on perceptual distortion metrics and high performance on specific domains thanks to their learning ability. However, state of the art neural codecs does not take advantage of the existence of gradient of entropy in decoding device. In this paper, we theoretically show that gradient of entropy (available at decoder side) is correlated with the gradient of the reconstruction error (which is not available at decoder side). We then demonstrate experimentally that this gradient can be used on various compression methods, leading to a $1-2\%$ rate savings for the same quality. Our method is orthogonal to other improvements and brings independent rate savings.

* Published to ICIP2023, 6 pages, 1 figure

Via

Access Paper or Ask Questions

Entropy Coding Improvement for Low-complexity Compressive Auto-encoders

Mar 10, 2023

Franck Galpin, Muhammet Balcilar, Frédéric Lefebvre, Fabien Racapé, Pierre Hellier

Abstract:End-to-end image and video compression using auto-encoders (AE) offers new appealing perspectives in terms of rate-distortion gains and applications. While most complex models are on par with the latest compression standard like VVC/H.266 on objective metrics, practical implementation and complexity remain strong issues for real-world applications. In this paper, we propose a practical implementation suitable for realistic applications, leading to a low-complexity model. We demonstrate that some gains can be achieved on top of a state-of-the-art low-complexity AE, even when using simpler implementation. Improvements include off-training entropy coding improvement and encoder side Rate Distortion Optimized Quantization. Results show a 19% improvement in BDrate on basic implementation of fully-factorized model, and 15.3% improvement compared to the original implementation. The proposed implementation also allows a direct integration of such approaches on a variety of platforms.

* IEEE Data Compression Conference (DCC) 2023

Via

Access Paper or Ask Questions

RQAT-INR: Improved Implicit Neural Image Compression

Mar 06, 2023

Bharath Bhushan Damodaran, Muhammet Balcilar, Franck Galpin, Pierre Hellier

Figure 1 for RQAT-INR: Improved Implicit Neural Image Compression

Figure 2 for RQAT-INR: Improved Implicit Neural Image Compression

Figure 3 for RQAT-INR: Improved Implicit Neural Image Compression

Figure 4 for RQAT-INR: Improved Implicit Neural Image Compression

Abstract:Deep variational autoencoders for image and video compression have gained significant attraction in the recent years, due to their potential to offer competitive or better compression rates compared to the decades long traditional codecs such as AVC, HEVC or VVC. However, because of complexity and energy consumption, these approaches are still far away from practical usage in industry. More recently, implicit neural representation (INR) based codecs have emerged, and have lower complexity and energy usage to classical approaches at decoding. However, their performances are not in par at the moment with state-of-the-art methods. In this research, we first show that INR based image codec has a lower complexity than VAE based approaches, then we propose several improvements for INR-based image codec and outperformed baseline model by a large margin.

* Accepted as oral at Data compression conference 2023

Via

Access Paper or Ask Questions

Reducing The Mismatch Between Marginal and Learned Distributions in Neural Video Compression

Oct 12, 2022

Muhammet Balcilar, Bharath Bhushan Damodaran, Pierre Hellier

Figure 1 for Reducing The Mismatch Between Marginal and Learned Distributions in Neural Video Compression

Figure 2 for Reducing The Mismatch Between Marginal and Learned Distributions in Neural Video Compression

Abstract:During the last four years, we have witnessed the success of end-to-end trainable models for image compression. Compared to decades of incremental work, these machine learning (ML) techniques learn all the components of the compression technique, which explains their actual superiority. However, end-to-end ML models have not yet reached the performance of traditional video codecs such as VVC. Possible explanations can be put forward: lack of data to account for the temporal redundancy, or inefficiency of latent's density estimation in the neural model. The latter problem can be defined by the discrepancy between the latent's marginal distribution and the learned prior distribution. This mismatch, known as amortization gap of entropy model, enlarges the file size of compressed data. In this paper, we propose to evaluate the amortization gap for three state-of-the-art ML video compression methods. Second, we propose an efficient and generic method to solve the amortization gap and show that it leads to an improvement between $2\%$ to $5\%$ without impacting reconstruction quality.

* VCIP2022

Via

Access Paper or Ask Questions

Reducing The Amortization Gap of Entropy Bottleneck In End-to-End Image Compression

Sep 02, 2022

Muhammet Balcilar, Bharath Damodaran, Pierre Hellier

Figure 1 for Reducing The Amortization Gap of Entropy Bottleneck In End-to-End Image Compression

Figure 2 for Reducing The Amortization Gap of Entropy Bottleneck In End-to-End Image Compression

Figure 3 for Reducing The Amortization Gap of Entropy Bottleneck In End-to-End Image Compression

Figure 4 for Reducing The Amortization Gap of Entropy Bottleneck In End-to-End Image Compression

Abstract:End-to-end deep trainable models are about to exceed the performance of the traditional handcrafted compression techniques on videos and images. The core idea is to learn a non-linear transformation, modeled as a deep neural network, mapping input image into latent space, jointly with an entropy model of the latent distribution. The decoder is also learned as a deep trainable network, and the reconstructed image measures the distortion. These methods enforce the latent to follow some prior distributions. Since these priors are learned by optimization over the entire training set, the performance is optimal in average. However, it cannot fit exactly on every single new instance, hence damaging the compression performance by enlarging the bit-stream. In this paper, we propose a simple yet efficient instance-based parameterization method to reduce this amortization gap at a minor cost. The proposed method is applicable to any end-to-end compressing methods, improving the compression bitrate by 1% without any impact on the reconstruction quality.

* 5 pages, 3 figures,

Via

Access Paper or Ask Questions

Video Coding Using Learned Latent GAN Compression

Jul 12, 2022

Mustafa Shukor, Bharath Bhushan Damodaran, Xu Yao, Pierre Hellier

Figure 1 for Video Coding Using Learned Latent GAN Compression

Figure 2 for Video Coding Using Learned Latent GAN Compression

Figure 3 for Video Coding Using Learned Latent GAN Compression

Figure 4 for Video Coding Using Learned Latent GAN Compression

Abstract:We propose in this paper a new paradigm for facial video compression. We leverage the generative capacity of GANs such as StyleGAN to represent and compress a video, including intra and inter compression. Each frame is inverted in the latent space of StyleGAN, from which the optimal compression is learned. To do so, a diffeomorphic latent representation is learned using a normalizing flows model, where an entropy model can be optimized for image coding. In addition, we propose a new perceptual loss that is more efficient than other counterparts. Finally, an entropy model for video inter coding with residual is also learned in the previously constructed latent representation. Our method (SGANC) is simple, faster to train, and achieves better results for image and video coding compared to state-of-the-art codecs such as VTM, AV1, and recent deep learning techniques. In particular, it drastically minimizes perceptual distortion at low bit rates.

* Accepted at ACM Multimedia 2022

Via

Access Paper or Ask Questions

Semantic Unfolding of StyleGAN Latent Space

Jun 29, 2022

Mustafa Shukor, Xu Yao, Bharath Bushan Damodaran, Pierre Hellier

Figure 1 for Semantic Unfolding of StyleGAN Latent Space

Figure 2 for Semantic Unfolding of StyleGAN Latent Space

Figure 3 for Semantic Unfolding of StyleGAN Latent Space

Abstract:Generative adversarial networks (GANs) have proven to be surprisingly efficient for image editing by inverting and manipulating the latent code corresponding to an input real image. This editing property emerges from the disentangled nature of the latent space. In this paper, we identify that the facial attribute disentanglement is not optimal, thus facial editing relying on linear attribute separation is flawed. We thus propose to improve semantic disentanglement with supervision. Our method consists in learning a proxy latent representation using normalizing flows, and we show that this leads to a more efficient space for face image editing.

* Accepted at ICIP22

Via

Access Paper or Ask Questions