Thomas Oberlin

Style Transfer with Diffusion Models for Synthetic-to-Real Domain Adaptation

May 22, 2025

Estelle Chigot, Dennis G. Wilson, Meriem Ghrib, Thomas Oberlin

Abstract: Semantic segmentation models trained on synthetic data often perform poorly on real-world images due to domain gaps, particularly in adverse conditions where labeled data is scarce. Yet, recent foundation models make it possible to generate realistic images without any training. This paper proposes to leverage such diffusion models to improve the performance of vision models trained on synthetic data. We introduce two novel techniques for semantically consistent style transfer using diffusion models: Class-wise Adaptive Instance Normalization and Cross-Attention (CACTI) and its extension with selective attention Filtering (CACTIF). CACTI applies statistical normalization selectively based on semantic classes, while CACTIF further filters cross-attention maps based on feature similarity, preventing artifacts in regions with weak cross-attention correspondences. Our methods transfer style characteristics while preserving semantic boundaries and structural coherence, unlike approaches that apply global transformations or generate content without constraints. Experiments using GTA5 as source and Cityscapes/ACDC as target domains show that our approach produces higher quality images with lower FID scores and better content preservation. Our work demonstrates that class-aware diffusion-based style transfer effectively bridges the synthetic-to-real domain gap even with minimal target domain data, advancing robust perception systems for challenging real-world applications. The source code is available at: https://github.com/echigot/cactif.
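The class-wise normalization idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (which operates inside a diffusion model and also involves cross-attention); the function name `classwise_adain`, the `(C, H, W)` feature layout, and the `eps` parameter are assumptions made for illustration only. For each semantic class present in both images, the content features of that class are re-normalized to the mean and standard deviation of the style features of the same class.

```python
import numpy as np

def classwise_adain(content, content_labels, style, style_labels, eps=1e-5):
    """Hypothetical sketch of class-wise adaptive instance normalization.

    content, style: float arrays of shape (C, H, W).
    content_labels, style_labels: integer class maps of shape (H, W).
    """
    out = content.astype(np.float64).copy()
    # Only classes present in both label maps can be matched.
    for cls in np.intersect1d(content_labels, style_labels):
        c_mask = content_labels == cls
        s_mask = style_labels == cls
        c_feat = content[:, c_mask]  # shape (C, n_content_pixels)
        s_feat = style[:, s_mask]    # shape (C, n_style_pixels)
        c_mean = c_feat.mean(axis=1, keepdims=True)
        c_std = c_feat.std(axis=1, keepdims=True) + eps
        s_mean = s_feat.mean(axis=1, keepdims=True)
        s_std = s_feat.std(axis=1, keepdims=True) + eps
        # Whiten with the content statistics, re-color with the style statistics.
        out[:, c_mask] = (c_feat - c_mean) / c_std * s_std + s_mean
    return out
```

Pixels whose class does not appear in the style image are left unchanged in this sketch, which is one simple design choice among several.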

* Under review

PG-DPIR: An efficient plug-and-play method for high-count Poisson-Gaussian inverse problems

Apr 14, 2025

Maud Biquard, Marie Chabert, Florence Genin, Christophe Latry, Thomas Oberlin

Abstract: Poisson-Gaussian noise describes the noise of various imaging systems, hence the need for efficient algorithms for Poisson-Gaussian image restoration. Deep learning methods offer state-of-the-art performance but often require sensor-specific training when used in a supervised setting. A promising alternative is given by plug-and-play (PnP) methods, which learn only a regularization through a denoiser, making it possible to restore images from several sources with the same network. This paper introduces PG-DPIR, an efficient PnP method for high-count Poisson-Gaussian inverse problems, adapted from DPIR. While DPIR is designed for white Gaussian noise, a naive adaptation to Poisson-Gaussian noise leads to prohibitively slow algorithms due to the absence of a closed-form proximal operator. To address this, we adapt DPIR to the specificities of Poisson-Gaussian noise and propose in particular an efficient initialization of the gradient descent required for the proximal step that accelerates convergence by several orders of magnitude. Experiments are conducted on satellite image restoration and super-resolution problems. High-resolution realistic Pleiades images are simulated for the experiments, which demonstrate that PG-DPIR achieves state-of-the-art performance with improved efficiency, which seems promising for on-ground satellite processing chains.

Deep priors for satellite image restoration with accurate uncertainties

Dec 05, 2024

Maud Biquard, Marie Chabert, Florence Genin, Christophe Latry, Thomas Oberlin

Abstract: Satellite optical images, upon their on-ground receipt, offer a distorted view of the observed scene. Their restoration, classically including denoising, deblurring, and sometimes super-resolution, is required before their exploitation. Moreover, quantifying the uncertainty related to this restoration could be valuable by lowering the risk of hallucination and avoiding propagating these biases in downstream applications. Deep learning methods are now state-of-the-art for satellite image restoration. However, they require training a specific network for each sensor and they do not provide the associated uncertainties. This paper proposes a generic method involving a single network to restore images from several sensors and a scalable way to derive the uncertainties. We focus on deep regularization (DR) methods, which learn a deep prior on target images before plugging it into a model-based optimization scheme. First, we introduce VBLE-xz, which solves the inverse problem in the latent space of a variational compressive autoencoder, estimating the uncertainty jointly in the latent and in the image spaces. It enables scalable posterior sampling with relevant and calibrated uncertainties. Second, we propose the denoiser-based method SatDPIR, adapted from DPIR, which efficiently computes accurate point estimates. We conduct a comprehensive set of experiments on very high resolution simulated and real Pleiades images, asserting both the performance and robustness of the proposed methods. VBLE-xz and SatDPIR achieve state-of-the-art results compared to direct inversion methods. In particular, VBLE-xz is a scalable method to get realistic posterior samples and accurate uncertainties, while SatDPIR represents a compelling alternative to direct inversion methods when uncertainty quantification is not required.

On-the-fly spectral unmixing based on Kalman filtering

Jul 22, 2024

Hugues Kouakou, José Henrique de Morais Goulart, Raffaele Vitale, Thomas Oberlin, David Rousseau, Cyril Ruckebusch, Nicolas Dobigeon

Abstract: This work introduces an on-the-fly (i.e., online) linear unmixing method which is able to sequentially analyze spectral data acquired on a spectrum-by-spectrum basis. After deriving a sequential counterpart of the conventional linear mixing model, the proposed approach recasts the linear unmixing problem into a linear state-space estimation framework. Under Gaussian noise and state models, the estimation of the pure spectra can be efficiently conducted by resorting to Kalman filtering. Interestingly, it is shown that this Kalman filter can operate in a lower-dimensional subspace while ensuring the nonnegativity constraint inherent to pure spectra. This dimensionality reduction significantly lightens the computational burden, while leveraging recent advances related to the representation of essential spectral information. The proposed method is evaluated through extensive numerical experiments conducted on synthetic and real Raman data sets. The results show that this Kalman filter-based method offers a convenient trade-off between unmixing accuracy and computational efficiency, which is crucial for operating in an on-the-fly setting. To the best of the authors' knowledge, this is the first operational method which is able to solve the spectral unmixing problem efficiently in a dynamic fashion. It also constitutes a valuable building block for benefiting from acquisition and processing frameworks recently proposed in the microscopy literature, which are motivated by practical issues such as reducing acquisition time and avoiding potential damage to photosensitive samples.
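To make the state-space viewpoint concrete, here is a minimal sketch of one generic Kalman filter step for a linear Gaussian model. It is not the paper's algorithm: it omits the subspace reduction and the nonnegativity constraint described in the abstract, and it assumes a random-walk state model (identity transition). The names `kalman_step`, `H`, `Q`, and `R` are illustrative choices, with `H` standing for whatever linear observation operator links the state to the newly acquired spectrum.

```python
import numpy as np

def kalman_step(x, P, y, H, Q, R):
    """One predict/update step of a standard Kalman filter (hypothetical sketch).

    x: state estimate (n,), P: state covariance (n, n),
    y: new observation (m,), H: observation matrix (m, n),
    Q: state noise covariance (n, n), R: observation noise covariance (m, m).
    """
    # Predict: with an identity transition the mean is unchanged
    # and the covariance grows by the state noise.
    x_pred = x
    P_pred = P + Q
    # Update: innovation covariance and Kalman gain.
    S = H @ P_pred @ H.T + R
    # K = P_pred H^T S^{-1}, computed with a linear solve instead of an
    # explicit inverse (valid because S and P_pred are symmetric).
    K = np.linalg.solve(S, H @ P_pred).T
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(x.shape[0]) - K @ H) @ P_pred
    return x_new, P_new
```

Each call processes a single observation, so the cost per incoming spectrum depends only on the state and observation dimensions, not on how many spectra have been seen so far.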

Leveraging Self-Supervised Instance Contrastive Learning for Radar Object Detection

Feb 13, 2024

Colin Decourt, Rufin VanRullen, Didier Salle, Thomas Oberlin

Abstract: In recent years, driven by the need for safer and more autonomous transport systems, the automotive industry has shifted toward integrating a growing number of Advanced Driver Assistance Systems (ADAS). Among the array of sensors employed for object recognition tasks, radar sensors have emerged as a formidable contender due to their abilities in adverse weather conditions or low-light scenarios and their robustness in maintaining consistent performance across diverse environments. However, the small size of radar datasets and the complexity of labelling those data limit the performance of radar object detectors. Driven by the promising results of self-supervised learning in computer vision, this paper presents RiCL, an instance contrastive learning framework to pre-train radar object detectors. We propose to exploit the detection from the radar and the temporal information to pre-train the radar object detection model in a self-supervised way using contrastive learning. We aim to pre-train an object detector's backbone, head and neck to learn with fewer data. Experiments on the CARRADA and the RADDet datasets show the effectiveness of our approach in learning generic representations of objects in range-Doppler maps. Notably, our pre-training strategy allows us to use only 20% of the labelled data to reach an mAP@0.5 similar to that of a supervised approach using the whole training set.

* 8 pages, 3 figures, 1 table

Variational Bayes image restoration with compressive autoencoders

Nov 29, 2023

Maud Biquard, Marie Chabert, Thomas Oberlin

Abstract: Regularization of inverse problems is of paramount importance in computational imaging. The ability of neural networks to learn efficient image representations has been recently exploited to design powerful data-driven regularizers. While state-of-the-art plug-and-play methods rely on an implicit regularization provided by neural denoisers, alternative Bayesian approaches consider Maximum A Posteriori (MAP) estimation in the latent space of a generative model, thus with an explicit regularization. However, state-of-the-art deep generative models require a huge amount of training data compared to denoisers. Besides, their complexity hampers the optimization of the latent MAP. In this work, we propose to use compressive autoencoders for latent estimation. These networks, which can be seen as variational autoencoders with a flexible latent prior, are smaller and easier to train than state-of-the-art generative models. We then introduce the Variational Bayes Latent Estimation (VBLE) algorithm, which performs this estimation within the framework of variational inference. This allows for fast and easy (approximate) posterior sampling. Experimental results on image datasets BSD and FFHQ demonstrate that VBLE reaches performance similar to that of state-of-the-art plug-and-play methods, while being able to quantify uncertainties faster than other existing posterior sampling techniques.

A recurrent CNN for online object detection on raw radar frames

Dec 21, 2022

Colin Decourt, Rufin VanRullen, Didier Salle, Thomas Oberlin

Abstract: Automotive radar sensors provide valuable information for advanced driving assistance systems (ADAS). Radars can reliably estimate the distance to an object and the relative velocity, regardless of weather and light conditions. However, radar sensors suffer from low resolution and huge intra-class variations in the shape of objects. Exploiting the time information (e.g., multiple frames) has been shown to help better capture the dynamics of objects and, therefore, the variation in the shape of objects. Most temporal radar object detectors use 3D convolutions to learn spatial and temporal information. However, these methods are often non-causal and unsuitable for real-time applications. This work presents RECORD, a new recurrent CNN architecture for online radar object detection. We propose an end-to-end trainable architecture mixing convolutions and ConvLSTMs to learn spatio-temporal dependencies between successive frames. Our model is causal and requires only the past information encoded in the memory of the ConvLSTMs to detect objects. Our experiments show the relevance of such a method for detecting objects in different radar representations (range-Doppler, range-angle) and show that it outperforms state-of-the-art models on the ROD2021 and CARRADA datasets while being less computationally expensive. The code will be available soon.

* 13 pages, 3 figures

Algorithms for audio inpainting based on probabilistic nonnegative matrix factorization

Jun 28, 2022

Ondřej Mokrý, Paul Magron, Thomas Oberlin, Cédric Févotte

Abstract: Audio inpainting, i.e., the task of restoring missing or occluded audio signal samples, usually relies on sparse representations or autoregressive modeling. In this paper, we propose to structure the spectrogram with nonnegative matrix factorization (NMF) in a probabilistic framework. First, we treat the missing samples as latent variables, and derive two expectation-maximization algorithms for estimating the parameters of the model, depending on whether we formulate the problem in the time- or time-frequency domain. Then, we treat the missing samples as parameters, and we address this novel problem by deriving an alternating minimization scheme. We assess the potential of these algorithms for the task of restoring short- to middle-length gaps in music signals. Experiments reveal great convergence properties of the proposed methods, as well as competitive performance when compared to state-of-the-art audio inpainting techniques.

Learning the Proximity Operator in Unfolded ADMM for Phase Retrieval

Apr 04, 2022

Pierre-Hugo Vial, Paul Magron, Thomas Oberlin, Cédric Févotte

Abstract: This paper considers the phase retrieval (PR) problem, which aims to reconstruct a signal from phaseless measurements such as magnitude or power spectrograms. PR is generally handled as a minimization problem involving a quadratic loss. Recent works have considered alternative discrepancy measures, such as the Bregman divergences, but it is still challenging to tailor the optimal loss for a given setting. In this paper we propose a novel strategy to automatically learn the optimal metric for PR. We unfold a recently introduced ADMM algorithm into a neural network, and we emphasize that the information about the loss used to formulate the PR problem is conveyed by the proximity operator involved in the ADMM updates. Therefore, we replace this proximity operator with trainable activation functions: learning these in a supervised setting is then equivalent to learning an optimal metric for PR. Experiments conducted with speech signals show that our approach outperforms the baseline ADMM, using a light and interpretable neural architecture.

* 10 pages, 5 figures, submitted to IEEE SPL

Regularization via deep generative models: an analysis point of view

Jan 21, 2021

Thomas Oberlin, Mathieu Verm

Abstract: This paper proposes a new way of regularizing an inverse problem in imaging (e.g., deblurring or inpainting) by means of a deep generative neural network. Compared to end-to-end models, such approaches seem particularly interesting since the same network can be used for many different problems and experimental conditions, as long as the generative model is suited to the data. Previous works proposed to use a synthesis framework, where the estimation is performed on the latent vector, the solution being obtained afterwards via the decoder. Instead, we propose an analysis formulation where we directly optimize the image itself and penalize the latent vector. We illustrate the interest of such a formulation by running inpainting, deblurring and super-resolution experiments. In many cases our technique achieves a clear improvement of the performance and seems to be more robust, in particular with respect to initialization.
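The contrast between the two formulations can be written out as follows. This is a sketch under stated assumptions rather than the paper's exact objective: $A$ denotes the forward operator, $y$ the observation, $G$ the decoder of the generative model, $E$ an encoder mapping an image to its latent vector, and $\lambda > 0$ a regularization weight. The abstract only says that the latent vector is penalized; the use of an encoder $E$ and of a squared $\ell_2$ penalty are assumptions made here for illustration.

```latex
% Synthesis: optimize over the latent vector, then decode.
\hat{z} = \arg\min_{z} \; \tfrac{1}{2}\,\| A\,G(z) - y \|_2^2 + \lambda\,\| z \|_2^2,
\qquad \hat{x} = G(\hat{z})

% Analysis: optimize over the image itself, penalizing its latent code.
\hat{x} = \arg\min_{x} \; \tfrac{1}{2}\,\| A\,x - y \|_2^2 + \lambda\,\| E(x) \|_2^2
```

In the synthesis form the solution is constrained to the range of $G$, whereas in the analysis form the image is free and the generative model acts only through the penalty.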
