Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Saïd Ladjal

VibrantLeaves: A principled parametric image generator for training deep restoration models

Apr 14, 2025

Raphael Achddou, Yann Gousseau, Saïd Ladjal, Sabine Süsstrunk

Abstract:Even though Deep Neural Networks are extremely powerful for image restoration tasks, they have several limitations. They are poorly understood and suffer from strong biases inherited from the training sets. One way to address these shortcomings is to have a better control over the training sets, in particular by using synthetic sets. In this paper, we propose a synthetic image generator relying on a few simple principles. In particular, we focus on geometric modeling, textures, and a simple modeling of image acquisition. These properties, integrated in a classical Dead Leaves model, enable the creation of efficient training sets. Standard image denoising and super-resolution networks can be trained on such datasets, reaching performance almost on par with training on natural image datasets. As a first step towards explainability, we provide a careful analysis of the considered principles, identifying which image properties are necessary to obtain good performances. Besides, such training also yields better robustness to various geometric and radiometric perturbations of the test sets.

Via

Access Paper or Ask Questions

A Compact and Semantic Latent Space for Disentangled and Controllable Image Editing

Dec 13, 2023

Gwilherm Lesné, Yann Gousseau, Saïd Ladjal, Alasdair Newson

Abstract:Recent advances in the field of generative models and in particular generative adversarial networks (GANs) have lead to substantial progress for controlled image editing, especially compared with the pre-deep learning era. Despite their powerful ability to apply realistic modifications to an image, these methods often lack properties like disentanglement (the capacity to edit attributes independently). In this paper, we propose an auto-encoder which re-organizes the latent space of StyleGAN, so that each attribute which we wish to edit corresponds to an axis of the new latent space, and furthermore that the latent axes are decorrelated, encouraging disentanglement. We work in a compressed version of the latent space, using Principal Component Analysis, meaning that the parameter complexity of our autoencoder is reduced, leading to short training times ($\sim$ 45 mins). Qualitative and quantitative results demonstrate the editing capabilities of our approach, with greater disentanglement than competing methods, while maintaining fidelity to the original image with respect to identity. Our autoencoder architecture simple and straightforward, facilitating implementation.

Via

Access Paper or Ask Questions

An analysis of the transfer learning of convolutional neural networks for artistic images

Nov 05, 2020

Nicolas Gonthier, Yann Gousseau, Saïd Ladjal

Figure 1 for An analysis of the transfer learning of convolutional neural networks for artistic images

Figure 2 for An analysis of the transfer learning of convolutional neural networks for artistic images

Figure 3 for An analysis of the transfer learning of convolutional neural networks for artistic images

Figure 4 for An analysis of the transfer learning of convolutional neural networks for artistic images

Abstract:Transfer learning from huge natural image datasets, fine-tuning of deep neural networks and the use of the corresponding pre-trained networks have become de facto the core of art analysis applications. Nevertheless, the effects of transfer learning are still poorly understood. In this paper, we first use techniques for visualizing the network internal representations in order to provide clues to the understanding of what the network has learned on artistic images. Then, we provide a quantitative analysis of the changes introduced by the learning process thanks to metrics in both the feature and parameter spaces, as well as metrics computed on the set of maximal activation images. These analyses are performed on several variations of the transfer learning procedure. In particular, we observed that the network could specialize some pre-trained filters to the new image modality and also that higher layers tend to concentrate classes. Finally, we have shown that a double fine-tuning involving a medium-size artistic dataset can improve the classification on smaller datasets, even when the task changes.

Via

Access Paper or Ask Questions

Improving Interpretability for Computer-aided Diagnosis tools on Whole Slide Imaging with Multiple Instance Learning and Gradient-based Explanations

Sep 29, 2020

Antoine Pirovano, Hippolyte Heuberger, Sylvain Berlemont, Saïd Ladjal, Isabelle Bloch

Figure 1 for Improving Interpretability for Computer-aided Diagnosis tools on Whole Slide Imaging with Multiple Instance Learning and Gradient-based Explanations

Figure 2 for Improving Interpretability for Computer-aided Diagnosis tools on Whole Slide Imaging with Multiple Instance Learning and Gradient-based Explanations

Figure 3 for Improving Interpretability for Computer-aided Diagnosis tools on Whole Slide Imaging with Multiple Instance Learning and Gradient-based Explanations

Figure 4 for Improving Interpretability for Computer-aided Diagnosis tools on Whole Slide Imaging with Multiple Instance Learning and Gradient-based Explanations

Abstract:Deep learning methods are widely used for medical applications to assist medical doctors in their daily routines. While performances reach expert's level, interpretability (highlight how and what a trained model learned and why it makes a specific decision) is the next important challenge that deep learning methods need to answer to be fully integrated in the medical field. In this paper, we address the question of interpretability in the context of whole slide images (WSI) classification. We formalize the design of WSI classification architectures and propose a piece-wise interpretability approach, relying on gradient-based methods, feature visualization and multiple instance learning context. We aim at explaining how the decision is made based on tile level scoring, how these tile scores are decided and which features are used and relevant for the task. After training two WSI classification architectures on Camelyon-16 WSI dataset, highlighting discriminative features learned, and validating our approach with pathologists, we propose a novel manner of computing interpretability slide-level heat-maps, based on the extracted features, that improves tile-level classification performances by more than 29% for AUC.

* 8 pages (references excluded), 3 figures, presented in iMIMIC Workshop at MICCAI 2020

Via

Access Paper or Ask Questions

Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts

Sep 12, 2020

Nicolas Gonthier, Saïd Ladjal, Yann Gousseau

Figure 1 for Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts

Figure 2 for Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts

Figure 3 for Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts

Figure 4 for Multiple instance learning on deep features for weakly supervised object detection with extreme domain shifts

Abstract:Weakly supervised object detection (WSOD) using only image-level annotations has attracted a growing attention over the past few years. Whereas such task is typically addressed with a domain-specific solution focused on natural images, we show that a simple multiple instance approach applied on pre-trained deep features yields excellent performances on non-photographic datasets, possibly including new classes. The approach does not include any fine-tuning or cross-domain learning and is therefore efficient and possibly applicable to arbitrary datasets and classes. We investigate several flavors of the proposed approach, some including multi-layers perceptron and polyhedral classifiers. Despite its simplicity, our method shows competitive results on a range of publicly available datasets, including paintings (People-Art, IconArt), watercolors, cliparts and comics and allows to quickly learn unseen visual categories.

* 31 pages, 11 figures

Via

Access Paper or Ask Questions

High resolution neural texture synthesis with long range constraints

Aug 04, 2020

Nicolas Gonthier, Yann Gousseau, Saïd Ladjal

Figure 1 for High resolution neural texture synthesis with long range constraints

Figure 2 for High resolution neural texture synthesis with long range constraints

Figure 3 for High resolution neural texture synthesis with long range constraints

Figure 4 for High resolution neural texture synthesis with long range constraints

Abstract:The field of texture synthesis has witnessed important progresses over the last years, most notably through the use of Convolutional Neural Networks. However, neural synthesis methods still struggle to reproduce large scale structures, especially with high resolution textures. To address this issue, we first introduce a simple multi-resolution framework that efficiently accounts for long-range dependency. Then, we show that additional statistical constraints further improve the reproduction of textures with strong regularity. This can be achieved by constraining both the Gram matrices of a neural network and the power spectrum of the image. Alternatively one may constrain only the autocorrelation of the features of the network and drop the Gram matrices constraints. In an experimental part, the proposed methods are then extensively tested and compared to alternative approaches, both in an unsupervised way and through a user study. Experiments show the interest of the multi-scale scheme for high resolution textures and the interest of combining it with additional constraints for regular textures.

* 25 pages, 18 figures. LOW RESOLUTION PDF: Images may show compression artifacts

Via

Access Paper or Ask Questions

PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks

Jun 14, 2020

Chi-Hieu Pham, Saïd Ladjal, Alasdair Newson

Figure 1 for PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks

Figure 2 for PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks

Figure 3 for PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks

Figure 4 for PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks

Abstract:Autoencoders and generative models produce some of the most spectacular deep learning results to date. However, understanding and controlling the latent space of these models presents a considerable challenge. Drawing inspiration from principal component analysis and autoencoder, we propose the Principal Component Analysis Autoencoder (PCAAE). This is a novel autoencoder whose latent space verifies two properties. Firstly, the dimensions are organised in decreasing importance with respect to the data at hand. Secondly, the components of the latent space are statistically independent. We achieve this by progressively increasing the latent space during training, and with a covariance loss applied to the latent codes. The resulting autoencoder produces a latent space which separates the intrinsic attributes of the data into different components of the latent space, in a completely unsupervised manner. We also describe an extension of our approach to the case of powerful, pre-trained GANs. We show results on both synthetic examples of shapes and on a state-of-the-art GAN. For example, we are able to separate the color shade scale of hair and skin, pose of faces and the gender in the CelebA, without accessing any labels. We compare the PCAAE with other state-of-the-art approaches, in particular with respect to the ability to disentangle attributes in the latent space. We hope that this approach will contribute to better understanding of the intrinsic latent spaces of powerful deep generative models.

* Preprint with Appendix

Via

Access Paper or Ask Questions

Processsing Simple Geometric Attributes with Autoencoders

Apr 15, 2019

Alasdair Newson, Andrés Almansa, Yann Gousseau, Saïd Ladjal

Figure 1 for Processsing Simple Geometric Attributes with Autoencoders

Figure 2 for Processsing Simple Geometric Attributes with Autoencoders

Figure 3 for Processsing Simple Geometric Attributes with Autoencoders

Figure 4 for Processsing Simple Geometric Attributes with Autoencoders

Abstract:Image synthesis is a core problem in modern deep learning, and many recent architectures such as autoencoders and Generative Adversarial networks produce spectacular results on highly complex data, such as images of faces or landscapes. While these results open up a wide range of new, advanced synthesis applications, there is also a severe lack of theoretical understanding of how these networks work. This results in a wide range of practical problems, such as difficulties in training, the tendency to sample images with little or no variability, and generalisation problems. In this paper, we propose to analyse the ability of the simplest generative network, the autoencoder, to encode and decode two simple geometric attributes : size and position. We believe that, in order to understand more complicated tasks, it is necessary to first understand how these networks process simple attributes. For the first property, we analyse the case of images of centred disks with variable radii. We explain how the autoencoder projects these images to and from a latent space of smallest possible dimension, a scalar. In particular, we describe a closed-form solution to the decoding training problem in a network without biases, and show that during training, the network indeed finds this solution. We then investigate the best regularisation approaches which yield networks that generalise well. For the second property, position, we look at the encoding and decoding of Dirac delta functions, also known as `one-hot' vectors. We describe a hand-crafted filter that achieves encoding perfectly, and show that the network naturally finds this filter during training. We also show experimentally that the decoding can be achieved if the dataset is sampled in an appropriate manner.

Via

Access Paper or Ask Questions

A PCA-like Autoencoder

Apr 02, 2019

Saïd Ladjal, Alasdair Newson, Chi-Hieu Pham

Abstract:An autoencoder is a neural network which data projects to and from a lower dimensional latent space, where this data is easier to understand and model. The autoencoder consists of two sub-networks, the encoder and the decoder, which carry out these transformations. The neural network is trained such that the output is as close to the input as possible, the data having gone through an information bottleneck : the latent space. This tool bears significant ressemblance to Principal Component Analysis (PCA), with two main differences. Firstly, the autoencoder is a non-linear transformation, contrary to PCA, which makes the autoencoder more flexible and powerful. Secondly, the axes found by a PCA are orthogonal, and are ordered in terms of the amount of variability which the data presents along these axes. This makes the interpretability of the PCA much greater than that of the autoencoder, which does not have these attributes. Ideally, then, we would like an autoencoder whose latent space consists of independent components, ordered by decreasing importance to the data. In this paper, we propose an algorithm to create such a network. We create an iterative algorithm which progressively increases the size of the latent space, learning a new dimension at each step. Secondly, we propose a covariance loss term to add to the standard autoencoder loss function, as well as a normalisation layer just before the latent space, which encourages the latent space components to be statistically independent. We demonstrate the results of this autoencoder on simple geometric shapes, and find that the algorithm indeed finds a meaningful representation in the latent space. This means that subsequent interpolation in the latent space has meaning with respect to the geometric properties of the images.

Via

Access Paper or Ask Questions