Julien Moreau

Uncertainty-Aware Online Extrinsic Calibration: A Conformal Prediction Approach

Jan 12, 2025

Mathieu Cocheteux, Julien Moreau, Franck Davoine

Abstract:Accurate sensor calibration is crucial for autonomous systems, yet its uncertainty quantification remains underexplored. We present the first approach to integrate uncertainty awareness into online extrinsic calibration, combining Monte Carlo Dropout with Conformal Prediction to generate prediction intervals with a guaranteed level of coverage. Our method proposes a framework to enhance existing calibration models with uncertainty quantification, compatible with various network architectures. Validated on KITTI (RGB Camera-LiDAR) and DSEC (Event Camera-LiDAR) datasets, we demonstrate effectiveness across different visual sensor types, measuring performance with adapted metrics to evaluate the efficiency and reliability of the intervals. By providing calibration parameters with quantifiable confidence measures, we offer insights into the reliability of calibration estimates, which can greatly improve the robustness of sensor fusion in dynamic environments and usefully serve the Computer Vision community.

* Accepted for publication at WACV 2025

MULi-Ev: Maintaining Unperturbed LiDAR-Event Calibration

May 28, 2024

Mathieu Cocheteux, Julien Moreau, Franck Davoine

Abstract:Despite the increasing interest in enhancing perception systems for autonomous vehicles, the online calibration between event cameras and LiDAR - two sensors pivotal in capturing comprehensive environmental information - remains unexplored. We introduce MULi-Ev, the first online, deep learning-based framework tailored for the extrinsic calibration of event cameras with LiDAR. This advancement is instrumental for the seamless integration of LiDAR and event cameras, enabling dynamic, real-time calibration adjustments that are essential for maintaining optimal sensor alignment amidst varying operational conditions. Rigorously evaluated against the real-world scenarios presented in the DSEC dataset, MULi-Ev not only achieves substantial improvements in calibration accuracy but also sets a new standard for integrating LiDAR with event cameras in mobile platforms. Our findings reveal the potential of MULi-Ev to bolster the safety, reliability, and overall performance of event-based perception systems in autonomous driving, marking a significant step forward in their real-world deployment and effectiveness.

PseudoCal: Towards Initialisation-Free Deep Learning-Based Camera-LiDAR Self-Calibration

Sep 18, 2023

Mathieu Cocheteux, Julien Moreau, Franck Davoine

Abstract:Camera-LiDAR extrinsic calibration is a critical task for multi-sensor fusion in autonomous systems, such as self-driving vehicles and mobile robots. Traditional techniques often require manual intervention or specific environments, making them labour-intensive and error-prone. Existing deep learning-based self-calibration methods focus on small realignments and still rely on initial estimates, limiting their practicality. In this paper, we present PseudoCal, a novel self-calibration method that overcomes these limitations by leveraging the pseudo-LiDAR concept and working directly in the 3D space instead of limiting itself to the camera field of view. In typical autonomous vehicle and robotics contexts and conventions, PseudoCal is able to perform one-shot calibration quasi-independently of initial parameter estimates, addressing extreme cases that remain unsolved by existing approaches.

* British Machine Vision Conference (BMVC) 2023

Analysis over vision-based models for pedestrian action anticipation

May 27, 2023

Lina Achaji, Julien Moreau, François Aioun, François Charpillet

Abstract:Anticipating human actions in front of autonomous vehicles is a challenging task. Several papers have recently proposed model architectures to address this problem by combining multiple input features to predict pedestrian crossing actions. This paper focuses specifically on using images of the pedestrian's context as an input feature. We present several spatio-temporal model architectures that utilize standard CNN and Transformer modules to serve as a backbone for pedestrian anticipation. However, the objective of this paper is not to surpass state-of-the-art benchmarks but rather to analyze the positive and negative predictions of these models. Therefore, we provide insights on the explainability of vision-based Transformer models in the context of pedestrian action prediction. We will highlight cases where the model can achieve correct quantitative results but falls short in providing human-like explanations qualitatively, emphasizing the importance of investing in explainability for pedestrian action anticipation problems.

Learning to Estimate Two Dense Depths from LiDAR and Event Data

Feb 28, 2023

Vincent Brebion, Julien Moreau, Franck Davoine

Abstract:Event cameras do not produce images, but rather a continuous flow of events, which encode changes of illumination for each pixel independently and asynchronously. While they output temporally rich information, they lack any depth information which could facilitate their use with other sensors. LiDARs can provide this depth information, but are by nature very sparse, which makes the depth-to-event association more complex. Furthermore, as events represent changes of illumination, they might also represent changes of depth; associating them with a single depth is therefore inadequate. In this work, we propose to address these issues by fusing information from an event camera and a LiDAR using a learning-based approach to estimate accurate dense depth maps. To solve the "potential change of depth" problem, we propose here to estimate two depth maps at each step: one "before" the events happen, and one "after" the events happen. We further propose to use this pair of depths to compute a depth difference for each event, to give them more context. We train and evaluate our network, ALED, on both synthetic and real driving sequences, and show that it is able to predict dense depths with an error reduction of up to 61% compared to the current state of the art. We also demonstrate the quality of our 2-depths-to-event association, and the usefulness of the depth difference information. Finally, we release SLED, a novel synthetic dataset comprising events, LiDAR point clouds, RGB images, and dense depth maps.

* Accepted for SCIA 2023. For the project page, see https://vbrebion.github.io/ALED/

Unconventional Visual Sensors for Autonomous Vehicles

May 19, 2022

You Li, Julien Moreau, Javier Ibanez-Guzman

Abstract:Autonomous vehicles rely on perception systems to understand their surroundings for further navigation missions. Cameras are essential for perception systems due to the advantages of object detection and recognition provided by modern computer vision algorithms, compared to other sensors, such as LiDARs and radars. However, limited by its inherent imaging principle, a standard RGB camera may perform poorly in a variety of adverse scenarios, including but not limited to: low illumination, high contrast, bad weather such as fog/rain/snow, etc. Meanwhile, estimating the 3D information from the 2D image detection is generally more difficult when compared to LiDARs or radars. Several new sensing technologies have emerged in recent years to address the limitations of conventional RGB cameras. In this paper, we review the principles of four novel image sensors: infrared cameras, range-gated cameras, polarization cameras, and event cameras. Their comparative advantages, existing or potential applications, and corresponding data processing algorithms are all presented in a systematic manner. We expect that this study will assist practitioners in the autonomous driving society with new perspectives and insights.

PreTR: Spatio-Temporal Non-Autoregressive Trajectory Prediction Transformer

Mar 17, 2022

Lina Achaji, Thierno Barry, Thibault Fouqueray, Julien Moreau, Francois Aioun, Francois Charpillet

Abstract:Nowadays, our mobility systems are evolving into the era of intelligent vehicles that aim to improve road safety. Due to their vulnerability, pedestrians are the users who will benefit the most from these developments. However, predicting their trajectory is one of the most challenging concerns. Indeed, accurate prediction requires a good understanding of multi-agent interactions that can be complex. Learning the underlying spatial and temporal patterns caused by these interactions is even more of a competitive and open problem that many researchers are tackling. In this paper, we introduce a model called PRediction Transformer (PReTR) that extracts features from the multi-agent scenes by employing a factorized spatio-temporal attention module. It has lower computational requirements than previously studied models while achieving empirically better results. Besides, previous works in motion prediction suffer from the exposure bias problem caused by generating future sequences conditioned on model prediction samples rather than ground-truth samples. In order to go beyond the proposed solutions, we leverage encoder-decoder Transformer networks for the parallel decoding of a set of learned object queries. This non-autoregressive solution avoids the need for iterative conditioning and arguably decreases training and testing computational time. We evaluate our model on the ETH/UCY datasets, a publicly available benchmark for pedestrian trajectory prediction. Finally, we justify our usage of the parallel decoding technique by showing that the trajectory prediction task can be better solved as a non-autoregressive task.

Real-Time Optical Flow for Vehicular Perception with Low- and High-Resolution Event Cameras

Dec 20, 2021

Vincent Brebion, Julien Moreau, Franck Davoine

Abstract:Event cameras capture changes of illumination in the observed scene rather than accumulating light to create images. Thus, they allow for applications under high-speed motion and complex lighting conditions, where traditional frame-based sensors show their limits with blur and over- or underexposed pixels. Thanks to these unique properties, they are nowadays a highly attractive sensor for ITS-related applications. Event-based optical flow (EBOF) has been studied following the rise in popularity of these neuromorphic cameras. The recent arrival of high-definition neuromorphic sensors, however, challenges the existing approaches, because of the increased resolution of the events pixel array and a much higher throughput. As an answer to these points, we propose an optimized framework for computing optical flow in real-time with both low- and high-resolution event cameras. We formulate a novel dense representation for the sparse events flow, in the form of the "inverse exponential distance surface". It serves as an interim frame, designed for the use of proven, state-of-the-art frame-based optical flow computation methods. We evaluate our approach on both low- and high-resolution driving sequences, and show that it often achieves better results than the current state of the art, while also reaching higher frame rates, 250Hz at 346 x 260 pixels and 77Hz at 1280 x 720 pixels.

* 13 pages, journal paper

Is attention to bounding boxes all you need for pedestrian action prediction?

Jul 16, 2021

Lina Achaji, Julien Moreau, Thibault Fouqueray, Francois Aioun, Francois Charpillet

Abstract:The human driver is no longer the only one concerned with the complexity of the driving scenarios. Autonomous vehicles (AV) are similarly becoming involved in the process. Nowadays, the development of AV in urban places underpins essential safety concerns for vulnerable road users (VRUs) such as pedestrians. Therefore, to make the roads safer, it is critical to classify and predict their future behavior. In this paper, we present a framework based on multiple variations of the Transformer models to reason attentively about the dynamic evolution of the pedestrians' past trajectory and predict their future actions of crossing or not crossing the street. We proved that using only bounding boxes as input to our model can outperform the previous state-of-the-art models and reach a prediction accuracy of 91 % and an F1-score of 0.83 on the PIE dataset up to two seconds ahead in the future. In addition, we introduced a large-size simulated dataset (CP2A) using CARLA for action prediction. Our model has similarly reached high accuracy (91 %) and F1-score (0.91) on this dataset. Interestingly, we showed that pre-training our Transformer model on the simulated dataset and then fine-tuning it on the real dataset can be very effective for the action prediction task.

Associative Embedding for Game-Agnostic Team Discrimination

Jul 01, 2019

Maxime Istasse, Julien Moreau, Christophe De Vleeschouwer

Abstract:Assigning team labels to players in a sports game is not a trivial task when no prior is known about the visual appearance of each team. Our work builds on a Convolutional Neural Network (CNN) to learn a descriptor, namely a pixel-wise embedding vector, that is similar for pixels depicting players from the same team, and dissimilar when pixels correspond to distinct teams. The advantage of this idea is that no per-game learning is needed, allowing efficient team discrimination as soon as the game starts. In principle, the approach follows the associative embedding framework introduced in arXiv:1611.05424 to differentiate instances of objects. Our work is however different in that it derives the embeddings from a lightweight segmentation network and, more fundamentally, because it considers the assignment of the same embedding to unconnected pixels, as required by pixels of distinct players from the same team. Excellent results, both in terms of team labelling accuracy and generalization to new games/arenas, have been achieved on panoramic views of a large variety of basketball games involving player interactions and occlusions. This makes our method a good candidate to integrate team separation in many CNN-based sport analytics pipelines.

* The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019
* Published in CVPR 2019 workshop Computer Vision in Sports, under the name "Associative Embedding for Team Discrimination" (http://openaccess.thecvf.com/content_CVPRW_2019/html/CVSports/Istasse_Associative_Embedding_for_Team_Discrimination_CVPRW_2019_paper.html)
