Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Dennis G. Wilson

Style Transfer with Diffusion Models for Synthetic-to-Real Domain Adaptation

May 22, 2025

Estelle Chigot, Dennis G. Wilson, Meriem Ghrib, Thomas Oberlin

Abstract:Semantic segmentation models trained on synthetic data often perform poorly on real-world images due to domain gaps, particularly in adverse conditions where labeled data is scarce. Yet, recent foundation models enable to generate realistic images without any training. This paper proposes to leverage such diffusion models to improve the performance of vision models when learned on synthetic data. We introduce two novel techniques for semantically consistent style transfer using diffusion models: Class-wise Adaptive Instance Normalization and Cross-Attention (CACTI) and its extension with selective attention Filtering (CACTIF). CACTI applies statistical normalization selectively based on semantic classes, while CACTIF further filters cross-attention maps based on feature similarity, preventing artifacts in regions with weak cross-attention correspondences. Our methods transfer style characteristics while preserving semantic boundaries and structural coherence, unlike approaches that apply global transformations or generate content without constraints. Experiments using GTA5 as source and Cityscapes/ACDC as target domains show that our approach produces higher quality images with lower FID scores and better content preservation. Our work demonstrates that class-aware diffusion-based style transfer effectively bridges the synthetic-to-real domain gap even with minimal target domain data, advancing robust perception systems for challenging real-world applications. The source code is available at: https://github.com/echigot/cactif.

* Under review

Via

Access Paper or Ask Questions

Exploration by Running Away from the Past

Nov 21, 2024

Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Koenigsbuch, Dennis G. Wilson, Emmanuel Rachelson

Abstract:The ability to explore efficiently and effectively is a central challenge of reinforcement learning. In this work, we consider exploration through the lens of information theory. Specifically, we cast exploration as a problem of maximizing the Shannon entropy of the state occupation measure. This is done by maximizing a sequence of divergences between distributions representing an agent's past behavior and its current behavior. Intuitively, this encourages the agent to explore new behaviors that are distinct from past behaviors. Hence, we call our method RAMP, for ``$\textbf{R}$unning $\textbf{A}$way fro$\textbf{m}$ the $\textbf{P}$ast.'' A fundamental question of this method is the quantification of the distribution change over time. We consider both the Kullback-Leibler divergence and the Wasserstein distance to quantify divergence between successive state occupation measures, and explain why the former might lead to undesirable exploratory behaviors in some tasks. We demonstrate that by encouraging the agent to explore by actively distancing itself from past experiences, it can effectively explore mazes and a wide range of behaviors on robotic manipulation and locomotion tasks.

Via

Access Paper or Ask Questions

Exploration by Learning Diverse Skills through Successor State Measures

Jun 14, 2024

Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Konigsbuch, Dennis G. Wilson, Emmanuel Rachelson

Abstract:The ability to perform different skills can encourage agents to explore. In this work, we aim to construct a set of diverse skills which uniformly cover the state space. We propose a formalization of this search for diverse skills, building on a previous definition based on the mutual information between states and skills. We consider the distribution of states reached by a policy conditioned on each skill and leverage the successor state measure to maximize the difference between these skill distributions. We call this approach LEADS: Learning Diverse Skills through Successor States. We demonstrate our approach on a set of maze navigation and robotic control tasks which show that our method is capable of constructing a diverse set of skills which exhaustively cover the state space without relying on reward or exploration bonuses. Our findings demonstrate that this new formalization promotes more robust and efficient exploration by combining mutual information maximization and exploration bonuses.

Via

Access Paper or Ask Questions

Quality with Just Enough Diversity in Evolutionary Policy Search

May 07, 2024

Paul Templier, Luca Grillotti, Emmanuel Rachelson, Dennis G. Wilson, Antoine Cully

Abstract:Evolution Strategies (ES) are effective gradient-free optimization methods that can be competitive with gradient-based approaches for policy search. ES only rely on the total episodic scores of solutions in their population, from which they estimate fitness gradients for their update with no access to true gradient information. However this makes them sensitive to deceptive fitness landscapes, and they tend to only explore one way to solve a problem. Quality-Diversity methods such as MAP-Elites introduced additional information with behavior descriptors (BD) to return a population of diverse solutions, which helps exploration but leads to a large part of the evaluation budget not being focused on finding the best performing solution. Here we show that behavior information can also be leveraged to find the best policy by identifying promising search areas which can then be efficiently explored with ES. We introduce the framework of Quality with Just Enough Diversity (JEDi) which learns the relationship between behavior and fitness to focus evaluations on solutions that matter. When trying to reach higher fitness values, JEDi outperforms both QD and ES methods on hard exploration tasks like mazes and on complex control problems with large policies.

Via

Access Paper or Ask Questions

Genetic Drift Regularization: on preventing Actor Injection from breaking Evolution Strategies

May 07, 2024

Paul Templier, Emmanuel Rachelson, Antoine Cully, Dennis G. Wilson

Abstract:Evolutionary Algorithms (EA) have been successfully used for the optimization of neural networks for policy search, but they still remain sample inefficient and underperforming in some cases compared to gradient-based reinforcement learning (RL). Various methods combine the two approaches, many of them training a RL algorithm on data from EA evaluations and injecting the RL actor into the EA population. However, when using Evolution Strategies (ES) as the EA, the RL actor can drift genetically far from the the ES distribution and injection can cause a collapse of the ES performance. Here, we highlight the phenomenon of genetic drift where the actor genome and the ES population distribution progressively drift apart, leading to injection having a negative impact on the ES. We introduce Genetic Drift Regularization (GDR), a simple regularization method in the actor training loss that prevents the actor genome from drifting away from the ES. We show that GDR can improve ES convergence on problems where RL learns well, but also helps RL training on other tasks, , fixes the injection issues better than previous controlled injection methods.

Via

Access Paper or Ask Questions

Kartezio: Evolutionary Design of Explainable Pipelines for Biomedical Image Analysis

Feb 28, 2023

Kévin Cortacero, Brienne McKenzie, Sabina Müller, Roxana Khazen, Fanny Lafouresse, Gaëlle Corsaut, Nathalie Van Acker, François-Xavier Frenois, Laurence Lamant, Nicolas Meyer(+7 more)

Abstract:An unresolved issue in contemporary biomedicine is the overwhelming number and diversity of complex images that require annotation, analysis and interpretation. Recent advances in Deep Learning have revolutionized the field of computer vision, creating algorithms that compete with human experts in image segmentation tasks. Crucially however, these frameworks require large human-annotated datasets for training and the resulting models are difficult to interpret. In this study, we introduce Kartezio, a modular Cartesian Genetic Programming based computational strategy that generates transparent and easily interpretable image processing pipelines by iteratively assembling and parameterizing computer vision functions. The pipelines thus generated exhibit comparable precision to state-of-the-art Deep Learning approaches on instance segmentation tasks, while requiring drastically smaller training datasets, a feature which confers tremendous flexibility, speed, and functionality to this approach. We also deployed Kartezio to solve semantic and instance segmentation problems in four real-world Use Cases, and showcase its utility in imaging contexts ranging from high-resolution microscopy to clinical pathology. By successfully implementing Kartezio on a portfolio of images ranging from subcellular structures to tumoral tissue, we demonstrated the flexibility, robustness and practical utility of this fully explicable evolutionary designer for semantic and instance segmentation.

* 42 pages, 6 main Figures, 3 Extended Data Figures, 5 Extended Data Tables, 1 Extended Data Movie. The Extended Data Movie is available at the following link: https://drive.google.com/file/d/1eNGhFC8gyu5xjVOhIZve894g3bBKXEgs/view?usp=sharing

Via

Access Paper or Ask Questions

Curiosity creates Diversity in Policy Search

Dec 07, 2022

Paul-Antoine Le Tolguenec, Emmanuel Rachelson, Yann Besse, Dennis G. Wilson

Figure 1 for Curiosity creates Diversity in Policy Search

Figure 2 for Curiosity creates Diversity in Policy Search

Figure 3 for Curiosity creates Diversity in Policy Search

Figure 4 for Curiosity creates Diversity in Policy Search

Abstract:When searching for policies, reward-sparse environments often lack sufficient information about which behaviors to improve upon or avoid. In such environments, the policy search process is bound to blindly search for reward-yielding transitions and no early reward can bias this search in one direction or another. A way to overcome this is to use intrinsic motivation in order to explore new transitions until a reward is found. In this work, we use a recently proposed definition of intrinsic motivation, Curiosity, in an evolutionary policy search method. We propose Curiosity-ES, an evolutionary strategy adapted to use Curiosity as a fitness metric. We compare Curiosity with Novelty, a commonly used diversity metric, and find that Curiosity can generate higher diversity over full episodes without the need for an explicit diversity criterion and lead to multiple policies which find reward.

Via

Access Paper or Ask Questions

Architectural Optimization over Subgroups for Equivariant Neural Networks

Oct 11, 2022

Kaitlin Maile, Dennis G. Wilson, Patrick Forré

Figure 1 for Architectural Optimization over Subgroups for Equivariant Neural Networks

Figure 2 for Architectural Optimization over Subgroups for Equivariant Neural Networks

Figure 3 for Architectural Optimization over Subgroups for Equivariant Neural Networks

Figure 4 for Architectural Optimization over Subgroups for Equivariant Neural Networks

Abstract:Incorporating equivariance to symmetry groups as a constraint during neural network training can improve performance and generalization for tasks exhibiting those symmetries, but such symmetries are often not perfectly nor explicitly present. This motivates algorithmically optimizing the architectural constraints imposed by equivariance. We propose the equivariance relaxation morphism, which preserves functionality while reparameterizing a group equivariant layer to operate with equivariance constraints on a subgroup, as well as the $[G]$-mixed equivariant layer, which mixes layers constrained to different groups to enable within-layer equivariance optimization. We further present evolutionary and differentiable neural architecture search (NAS) algorithms that utilize these mechanisms respectively for equivariance-aware architectural optimization. Experiments across a variety of datasets show the benefit of dynamically constrained equivariance to find effective architectures with approximate equivariance.

Via

Access Paper or Ask Questions

On Neural Consolidation for Transfer in Reinforcement Learning

Oct 05, 2022

Valentin Guillet, Dennis G. Wilson, Carlos Aguilar-Melchor, Emmanuel Rachelson

Figure 1 for On Neural Consolidation for Transfer in Reinforcement Learning

Figure 2 for On Neural Consolidation for Transfer in Reinforcement Learning

Figure 3 for On Neural Consolidation for Transfer in Reinforcement Learning

Figure 4 for On Neural Consolidation for Transfer in Reinforcement Learning

Abstract:Although transfer learning is considered to be a milestone in deep reinforcement learning, the mechanisms behind it are still poorly understood. In particular, predicting if knowledge can be transferred between two given tasks is still an unresolved problem. In this work, we explore the use of network distillation as a feature extraction method to better understand the context in which transfer can occur. Notably, we show that distillation does not prevent knowledge transfer, including when transferring from multiple tasks to a new one, and we compare these results with transfer without prior distillation. We focus our work on the Atari benchmark due to the variability between different games, but also to their similarities in terms of visual features.

* Published at the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (IEEE ADPRL), 2022

Via

Access Paper or Ask Questions

Neural Distillation as a State Representation Bottleneck in Reinforcement Learning

Oct 05, 2022

Valentin Guillet, Dennis G. Wilson, Carlos Aguilar-Melchor, Emmanuel Rachelson

Figure 1 for Neural Distillation as a State Representation Bottleneck in Reinforcement Learning

Figure 2 for Neural Distillation as a State Representation Bottleneck in Reinforcement Learning

Figure 3 for Neural Distillation as a State Representation Bottleneck in Reinforcement Learning

Figure 4 for Neural Distillation as a State Representation Bottleneck in Reinforcement Learning

Abstract:Learning a good state representation is a critical skill when dealing with multiple tasks in Reinforcement Learning as it allows for transfer and better generalization between tasks. However, defining what constitute a useful representation is far from simple and there is so far no standard method to find such an encoding. In this paper, we argue that distillation -- a process that aims at imitating a set of given policies with a single neural network -- can be used to learn a state representation displaying favorable characteristics. In this regard, we define three criteria that measure desirable features of a state encoding: the ability to select important variables in the input space, the ability to efficiently separate states according to their corresponding optimal action, and the robustness of the state encoding on new tasks. We first evaluate these criteria and verify the contribution of distillation on state representation on a toy environment based on the standard inverted pendulum problem, before extending our analysis on more complex visual tasks from the Atari and Procgen benchmarks.

* Published at the 1st Conference on Lifelong Learning Agents (CoLLAs), 2022

Via

Access Paper or Ask Questions