Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ola Ahmad

ProMi: An Efficient Prototype-Mixture Baseline for Few-Shot Segmentation with Bounding-Box Annotations

May 18, 2025

Florent Chiaroni, Ali Ayub, Ola Ahmad

Abstract:In robotics applications, few-shot segmentation is crucial because it allows robots to perform complex tasks with minimal training data, facilitating their adaptation to diverse, real-world environments. However, pixel-level annotations of even small amount of images is highly time-consuming and costly. In this paper, we present a novel few-shot binary segmentation method based on bounding-box annotations instead of pixel-level labels. We introduce, ProMi, an efficient prototype-mixture-based method that treats the background class as a mixture of distributions. Our approach is simple, training-free, and effective, accommodating coarse annotations with ease. Compared to existing baselines, ProMi achieves the best results across different datasets with significant gains, demonstrating its effectiveness. Furthermore, we present qualitative experiments tailored to real-world mobile robot tasks, demonstrating the applicability of our approach in such scenarios. Our code: https://github.com/ThalesGroup/promi.

Via

Access Paper or Ask Questions

QuaCK-TSF: Quantum-Classical Kernelized Time Series Forecasting

Aug 21, 2024

Abdallah Aaraba, Soumaya Cherkaoui, Ola Ahmad, Jean-Frédéric Laprade, Olivier Nahman-Lévesque, Alexis Vieloszynski, Shengrui Wang

Abstract:Forecasting in probabilistic time series is a complex endeavor that extends beyond predicting future values to also quantifying the uncertainty inherent in these predictions. Gaussian process regression stands out as a Bayesian machine learning technique adept at addressing this multifaceted challenge. This paper introduces a novel approach that blends the robustness of this Bayesian technique with the nuanced insights provided by the kernel perspective on quantum models, aimed at advancing quantum kernelized probabilistic forecasting. We incorporate a quantum feature map inspired by Ising interactions and demonstrate its effectiveness in capturing the temporal dependencies critical for precise forecasting. The optimization of our model's hyperparameters circumvents the need for computationally intensive gradient descent by employing gradient-free Bayesian optimization. Comparative benchmarks against established classical kernel models are provided, affirming that our quantum-enhanced approach achieves competitive performance.

* 12 pages, 15 figures, to be published in IEEE Quantum Week 2024's conference proceeding

Via

Access Paper or Ask Questions

DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture

Jul 24, 2024

Akshaya Athwale, Ichrak Shili, Émile Bergeron, Ola Ahmad, Jean-François Lalonde

Figure 1 for DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture

Figure 2 for DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture

Figure 3 for DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture

Figure 4 for DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture

Abstract:Wide-angle fisheye images are becoming increasingly common for perception tasks in applications such as robotics, security, and mobility (e.g. drones, avionics). However, current models often either ignore the distortions in wide-angle images or are not suitable to perform pixel-level tasks. In this paper, we present an encoder-decoder model based on a radial transformer architecture that adapts to distortions in wide-angle lenses by leveraging the physical characteristics defined by the radial distortion profile. In contrast to the original model, which only performs classification tasks, we introduce a U-Net architecture, DarSwin-Unet, designed for pixel level tasks. Furthermore, we propose a novel strategy that minimizes sparsity when sampling the image for creating its input tokens. Our approach enhances the model capability to handle pixel-level tasks in wide-angle fisheye images, making it more effective for real-world applications. Compared to other baselines, DarSwin-Unet achieves the best results across different datasets, with significant gains when trained on bounded levels of distortions (very low, low, medium, and high) and tested on all, including out-of-distribution distortions. We demonstrate its performance on depth estimation and show through extensive experiments that DarSwin-Unet can perform zero-shot adaptation to unseen distortions of different wide-angle lenses.

Via

Access Paper or Ask Questions

Neural Active Learning Meets the Partial Monitoring Framework

May 14, 2024

Maxime Heuillet, Ola Ahmad, Audrey Durand

Abstract:We focus on the online-based active learning (OAL) setting where an agent operates over a stream of observations and trades-off between the costly acquisition of information (labelled observations) and the cost of prediction errors. We propose a novel foundation for OAL tasks based on partial monitoring, a theoretical framework specialized in online learning from partially informative actions. We show that previously studied binary and multi-class OAL tasks are instances of partial monitoring. We expand the real-world potential of OAL by introducing a new class of cost-sensitive OAL tasks. We propose NeuralCBP, the first PM strategy that accounts for predictive uncertainty with deep neural networks. Our extensive empirical evaluation on open source datasets shows that NeuralCBP has favorable performance against state-of-the-art baselines on multiple binary, multi-class and cost-sensitive OAL tasks.

Via

Access Paper or Ask Questions

Randomized Confidence Bounds for Stochastic Partial Monitoring

Feb 07, 2024

Maxime Heuillet, Ola Ahmad, Audrey Durand

Figure 1 for Randomized Confidence Bounds for Stochastic Partial Monitoring

Figure 2 for Randomized Confidence Bounds for Stochastic Partial Monitoring

Figure 3 for Randomized Confidence Bounds for Stochastic Partial Monitoring

Figure 4 for Randomized Confidence Bounds for Stochastic Partial Monitoring

Abstract:The partial monitoring (PM) framework provides a theoretical formulation of sequential learning problems with incomplete feedback. On each round, a learning agent plays an action while the environment simultaneously chooses an outcome. The agent then observes a feedback signal that is only partially informative about the (unobserved) outcome. The agent leverages the received feedback signals to select actions that minimize the (unobserved) cumulative loss. In contextual PM, the outcomes depend on some side information that is observable by the agent before selecting the action on each round. In this paper, we consider the contextual and non-contextual PM settings with stochastic outcomes. We introduce a new class of strategies based on the randomization of deterministic confidence bounds, that extend regret guarantees to settings where existing stochastic strategies are not applicable. Our experiments show that the proposed RandCBP and RandCBPside* strategies improve state-of-the-art baselines in PM games. To encourage the adoption of the PM framework, we design a use case on the real-world problem of monitoring the error rate of any deployed classification system.

Via

Access Paper or Ask Questions

Successive Data Injection in Conditional Quantum GAN Applied to Time Series Anomaly Detection

Oct 08, 2023

Benjamin Kalfon, Soumaya Cherkaoui, Jean-Frédéric Laprade, Ola Ahmad, Shengrui Wang

Abstract:Classical GAN architectures have shown interesting results for solving anomaly detection problems in general and for time series anomalies in particular, such as those arising in communication networks. In recent years, several quantum GAN architectures have been proposed in the literature. When detecting anomalies in time series using QGANs, huge challenges arise due to the limited number of qubits compared to the size of the data. To address these challenges, we propose a new high-dimensional encoding approach, named Successive Data Injection (SuDaI). In this approach, we explore a larger portion of the quantum state than that in the conventional angle encoding, the method used predominantly in the literature, through repeated data injections into the quantum state. SuDaI encoding allows us to adapt the QGAN for anomaly detection with network data of a much higher dimensionality than with the existing known QGANs implementations. In addition, SuDaI encoding applies to other types of high-dimensional time series and can be used in contexts beyond anomaly detection and QGANs, opening up therefore multiple fields of application.

Via

Access Paper or Ask Questions

MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning

Jul 11, 2023

Julien Nicolas, Florent Chiaroni, Imtiaz Ziko, Ola Ahmad, Christian Desrosiers, Jose Dolz

Abstract:Despite the recent progress in incremental learning, addressing catastrophic forgetting under distributional drift is still an open and important problem. Indeed, while state-of-the-art domain incremental learning (DIL) methods perform satisfactorily within known domains, their performance largely degrades in the presence of novel domains. This limitation hampers their generalizability, and restricts their scalability to more realistic settings where train and test data are drawn from different distributions. To address these limitations, we present a novel DIL approach based on a mixture of prompt-tuned CLIP models (MoP-CLIP), which generalizes the paradigm of S-Prompting to handle both in-distribution and out-of-distribution data at inference. In particular, at the training stage we model the features distribution of every class in each domain, learning individual text and visual prompts to adapt to a given domain. At inference, the learned distributions allow us to identify whether a given test sample belongs to a known domain, selecting the correct prompt for the classification task, or from an unseen domain, leveraging a mixture of the prompt-tuned CLIP models. Our empirical evaluation reveals the poor performance of existing DIL methods under domain shift, and suggests that the proposed MoP-CLIP performs competitively in the standard DIL settings while outperforming state-of-the-art methods in OOD scenarios. These results demonstrate the superiority of MoP-CLIP, offering a robust and general solution to the problem of domain incremental learning.

* 13 pages, 5 figures

Via

Access Paper or Ask Questions

Causal Analysis for Robust Interpretability of Neural Networks

May 15, 2023

Ola Ahmad, Nicolas Bereux, Vahid Hashemi, Freddy Lecue

Figure 1 for Causal Analysis for Robust Interpretability of Neural Networks

Figure 2 for Causal Analysis for Robust Interpretability of Neural Networks

Figure 3 for Causal Analysis for Robust Interpretability of Neural Networks

Figure 4 for Causal Analysis for Robust Interpretability of Neural Networks

Abstract:Interpreting the inner function of neural networks is crucial for the trustworthy development and deployment of these black-box models. Prior interpretability methods focus on correlation-based measures to attribute model decisions to individual examples. However, these measures are susceptible to noise and spurious correlations encoded in the model during the training phase (e.g., biased inputs, model overfitting, or misspecification). Moreover, this process has proven to result in noisy and unstable attributions that prevent any transparent understanding of the model's behavior. In this paper, we develop a robust interventional-based method grounded by causal analysis to capture cause-effect mechanisms in pre-trained neural networks and their relation to the prediction. Our novel approach relies on path interventions to infer the causal mechanisms within hidden layers and isolate relevant and necessary information (to model prediction), avoiding noisy ones. The result is task-specific causal explanatory graphs that can audit model behavior and express the actual causes underlying its performance. We apply our method to vision models trained on classification tasks. On image classification tasks, we provide extensive quantitative experiments to show that our approach can capture more stable and faithful explanations than standard attribution-based methods. Furthermore, the underlying causal graphs reveal the neural interactions in the model, making it a valuable tool in other applications (e.g., model repair).

Via

Access Paper or Ask Questions

DarSwin: Distortion Aware Radial Swin Transformer

Apr 19, 2023

Akshaya Athwale, Arman Afrasiyabi, Justin Lague, Ichrak Shili, Ola Ahmad, Jean-Francois Lalonde

Figure 1 for DarSwin: Distortion Aware Radial Swin Transformer

Figure 2 for DarSwin: Distortion Aware Radial Swin Transformer

Figure 3 for DarSwin: Distortion Aware Radial Swin Transformer

Figure 4 for DarSwin: Distortion Aware Radial Swin Transformer

Abstract:Wide-angle lenses are commonly used in perception tasks requiring a large field of view. Unfortunately, these lenses produce significant distortions making conventional models that ignore the distortion effects unable to adapt to wide-angle images. In this paper, we present a novel transformer-based model that automatically adapts to the distortion produced by wide-angle lenses. We leverage the physical characteristics of such lenses, which are analytically defined by the radial distortion profile (assumed to be known), to develop a distortion aware radial swin transformer (DarSwin). In contrast to conventional transformer-based architectures, DarSwin comprises a radial patch partitioning, a distortion-based sampling technique for creating token embeddings, and a polar position encoding for radial patch merging. We validate our method on classification tasks using synthetically distorted ImageNet data and show through extensive experiments that DarSwin can perform zero-shot adaptation to unseen distortions of different wide-angle lenses. Compared to other baselines, DarSwin achieves the best results (in terms of Top-1 and -5 accuracy), when tested on in-distribution data, with almost 2% (6%) gain in Top-1 accuracy under medium (high) distortion levels, and comparable to the state-of-the-art under low and very low distortion levels (perspective-like images).

* 8 pages, 8 figures

Via

Access Paper or Ask Questions

FisheyeHDK: Hyperbolic Deformable Kernel Learning for Ultra-Wide Field-of-View Image Recognition

Mar 14, 2022

Ola Ahmad, Freddy Lecue

Figure 1 for FisheyeHDK: Hyperbolic Deformable Kernel Learning for Ultra-Wide Field-of-View Image Recognition

Figure 2 for FisheyeHDK: Hyperbolic Deformable Kernel Learning for Ultra-Wide Field-of-View Image Recognition

Figure 3 for FisheyeHDK: Hyperbolic Deformable Kernel Learning for Ultra-Wide Field-of-View Image Recognition

Figure 4 for FisheyeHDK: Hyperbolic Deformable Kernel Learning for Ultra-Wide Field-of-View Image Recognition

Abstract:Conventional convolution neural networks (CNNs) trained on narrow Field-of-View (FoV) images are the state-of-the-art approaches for object recognition tasks. Some methods proposed the adaptation of CNNs to ultra-wide FoV images by learning deformable kernels. However, they are limited by the Euclidean geometry and their accuracy degrades under strong distortions caused by fisheye projections. In this work, we demonstrate that learning the shape of convolution kernels in non-Euclidean spaces is better than existing deformable kernel methods. In particular, we propose a new approach that learns deformable kernel parameters (positions) in hyperbolic space. FisheyeHDK is a hybrid CNN architecture combining hyperbolic and Euclidean convolution layers for positions and features learning. First, we provide an intuition of hyperbolic space for wide FoV images. Using synthetic distortion profiles, we demonstrate the effectiveness of our approach. We select two datasets - Cityscapes and BDD100K 2020 - of perspective images which we transform to fisheye equivalents at different scaling factors (analog to focal lengths). Finally, we provide an experiment on data collected by a real fisheye camera. Validations and experiments show that our approach improves existing deformable kernel methods for CNN adaptation on fisheye images.

* Accepted at AAAI22

Via

Access Paper or Ask Questions