Rim Kaddah

Flow Models for Unbounded and Geometry-Aware Distributional Reinforcement Learning

May 07, 2025

Simo Alami C., Rim Kaddah, Jesse Read, Marie-Paule Cani

Abstract: We introduce a new architecture for Distributional Reinforcement Learning (DistRL) that models return distributions using normalizing flows. This approach enables flexible, unbounded support for return distributions, in contrast to categorical approaches like C51 that rely on fixed or bounded representations. It also offers richer modeling capacity than quantile-based approaches for capturing multi-modality, skewness, and tail behavior. Our method is significantly more parameter-efficient than categorical approaches. Standard metrics used to train existing models, such as KL divergence or Wasserstein distance, are either scale-insensitive or have biased sample gradients, especially when return supports do not overlap. To address this, we propose a novel surrogate for the Cramér distance that is geometry-aware and computable directly from the return distribution's PDF, avoiding the costly CDF computation. We test our model on the ATARI-5 sub-benchmark and show that our approach outperforms PDF-based models while remaining competitive with quantile-based methods.
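As a point of reference for the Cramér distance mentioned in the abstract, here is a minimal sketch of its textbook squared form on a discretized support, i.e. the sum of squared CDF differences. It is not the paper's PDF-based surrogate, which the abstract does not specify; the function name `cramer_distance_sq`, the grid spacing `dx`, and the toy distributions are assumptions made for illustration.

```python
from itertools import accumulate


def cramer_distance_sq(p, q, dx=1.0):
    """Squared Cramer distance between two discrete distributions.

    Assumes p and q are probability masses on the same evenly spaced grid
    with spacing dx (an illustrative setup, not the paper's method).
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same grid")
    cdf_p = list(accumulate(p))  # running sums give the CDF on the grid
    cdf_q = list(accumulate(q))
    return dx * sum((a - b) ** 2 for a, b in zip(cdf_p, cdf_q))


# Three point masses on a 4-bin grid with non-overlapping supports.
base = [1.0, 0.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0, 0.0]  # one bin away from base
far = [0.0, 0.0, 0.0, 1.0]   # three bins away from base

print(cramer_distance_sq(base, near))  # 1.0
print(cramer_distance_sq(base, far))   # 3.0
```

The distance grows with how far the mass has moved, even though the supports never overlap, which is the sense in which such a metric is geometry-aware.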

Transferable Deep Metric Learning for Clustering

Feb 13, 2023

Simo Alami C., Rim Kaddah, Jesse Read

Abstract: Clustering in high-dimensional spaces is a difficult task; the usual distance metrics may no longer be appropriate under the curse of dimensionality. Indeed, the choice of the metric is crucial, and it is highly dependent on the dataset characteristics. However, a single metric could be used to correctly perform clustering on multiple datasets from different domains. We propose to do so, providing a framework for learning a transferable metric. We show that we can learn a metric on a labelled dataset, then apply it to cluster a different dataset, using an embedding space that characterises a desired clustering in the generic sense. We learn and test such metrics on several datasets of variable complexity (synthetic, MNIST, SVHN, Omniglot) and achieve results competitive with the state of the art while using only a small number of labelled training datasets and shallow networks.

* Published in Symposium on Intelligent Data Analysis (IDA), 2023

Conv-NILM-Net, a causal and multi-appliance model for energy source separation

Aug 03, 2022

Mohamed Alami C., Jérémie Decock, Rim Kaddah, Jesse Read

Abstract: Non-Intrusive Load Monitoring (NILM) seeks to save energy by estimating individual appliance power usage from a single aggregate measurement. Deep neural networks have become increasingly popular in attempting to solve NILM problems. However, most models in use target Load Identification rather than online Source Separation. Among source separation models, most use a single-task learning approach in which a neural network is trained exclusively for each appliance. This strategy is computationally expensive and ignores both the fact that multiple appliances can be active simultaneously and the dependencies between them. The remaining models are not causal, although causality is important for real-time applications. Inspired by Conv-TasNet, a model for speech separation, we propose Conv-NILM-Net, a fully convolutional framework for end-to-end NILM. Conv-NILM-Net is a causal model for multi-appliance source separation. Our model is tested on two real datasets, REDD and UK-DALE, and clearly outperforms the state of the art while keeping a significantly smaller size than the competing models.
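To make the notion of a causal model concrete, the following is a minimal sketch of a causal 1D convolution, in which each output sample depends only on current and past input samples. It is not the Conv-NILM-Net architecture; the function name `causal_conv1d`, the averaging kernel, and the toy aggregate signal are assumptions chosen for illustration.

```python
def causal_conv1d(signal, kernel):
    """Causal 1D convolution: out[t] = sum_j kernel[j] * signal[t - j].

    Samples before the start of the signal are treated as zero, so no
    output ever looks at a future input (illustrative sketch only).
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(signal)  # left-pad with zeros only
    return [
        sum(kernel[j] * padded[t + k - 1 - j] for j in range(k))
        for t in range(len(signal))
    ]


aggregate = [1.0, 2.0, 3.0, 4.0]  # hypothetical aggregate power readings
kernel = [0.5, 0.5]               # average of the current and previous sample

print(causal_conv1d(aggregate, kernel))  # [0.5, 1.5, 2.5, 3.5]

# Changing the last (future) sample leaves all earlier outputs untouched.
perturbed = [1.0, 2.0, 3.0, 100.0]
print(causal_conv1d(perturbed, kernel)[:3])  # [0.5, 1.5, 2.5]
```

Because the padding is applied on the left only, the output at time t can be produced as soon as sample t arrives, which is what makes this kind of layer usable in a real-time setting.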

* Published in ECMLPKDD 2022, MLBEM workshop

CAMEO: Curiosity Augmented Metropolis for Exploratory Optimal Policies

May 19, 2022

Mohamed Alami Chehboune, Fernando Llorente, Rim Kaddah, Luca Martino, Jesse Read

Abstract: Reinforcement Learning has drawn huge interest as a tool for solving optimal control problems. Solving a given problem (task or environment) involves converging towards an optimal policy. However, there might exist multiple optimal policies that can dramatically differ in their behaviour; for example, some may be faster than others but at the expense of greater risk. We consider and study a distribution of optimal policies. We design a curiosity-augmented Metropolis algorithm (CAMEO), such that we can sample optimal policies, and such that these policies effectively adopt diverse behaviours, since this implies greater coverage of the different possible optimal policies. In experimental simulations we show that CAMEO indeed obtains policies that all solve classic control problems, even in the challenging case of environments that provide sparse rewards. We further show that the different policies we sample present different risk profiles, which corresponds to interesting practical applications in interpretability and represents a first step towards learning the distribution of optimal policies itself.
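For readers unfamiliar with the Metropolis algorithm that CAMEO builds on, here is a minimal sketch of a plain random-walk Metropolis sampler on a one-dimensional target. It contains no curiosity term and does not sample policies, so it should not be read as the CAMEO method; the names `metropolis` and `log_target`, the Gaussian target centred at 2, and the step size are all assumptions made for illustration.

```python
import math
import random


def metropolis(log_target, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis sampler for an unnormalized 1D log-density."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)           # symmetric proposal
        delta = log_target(proposal) - log_target(x)  # log acceptance ratio
        if rng.random() < math.exp(min(0.0, delta)):  # accept w.p. min(1, ratio)
            x = proposal
        samples.append(x)                             # rejected moves repeat x
    return samples


def log_target(x):
    """Hypothetical target: unnormalized Gaussian with mean 2 and unit variance."""
    return -0.5 * (x - 2.0) ** 2


samples = metropolis(log_target, x0=0.0, n_steps=5000)
kept = samples[500:]  # discard an initial burn-in segment
print(sum(kept) / len(kept))  # should land close to the target mean of 2
```

The sampler only needs the target up to a normalizing constant, which is why Metropolis-style methods are attractive when the distribution of interest can be scored but not normalized.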

* Published at the EUSIPCO 2022 conference
