Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jacques Kaiser

Highly Accurate and Diverse Traffic Data: The DeepScenario Open 3D Dataset

Apr 24, 2025

Oussema Dhaouadi, Johannes Meier, Luca Wahl, Jacques Kaiser, Luca Scalerandi, Nick Wandelburg, Zhuolun Zhou, Nijanthan Berinpanathan, Holger Banzhaf, Daniel Cremers

Abstract:Accurate 3D trajectory data is crucial for advancing autonomous driving. Yet, traditional datasets are usually captured by fixed sensors mounted on a car and are susceptible to occlusion. Additionally, such an approach can precisely reconstruct the dynamic environment in the close vicinity of the measurement vehicle only, while neglecting objects that are further away. In this paper, we introduce the DeepScenario Open 3D Dataset (DSC3D), a high-quality, occlusion-free dataset of 6 degrees of freedom bounding box trajectories acquired through a novel monocular camera drone tracking pipeline. Our dataset includes more than 175,000 trajectories of 14 types of traffic participants and significantly exceeds existing datasets in terms of diversity and scale, containing many unprecedented scenarios such as complex vehicle-pedestrian interaction on highly populated urban streets and comprehensive parking maneuvers from entry to exit. DSC3D dataset was captured in five various locations in Europe and the United States and include: a parking lot, a crowded inner-city, a steep urban intersection, a federal highway, and a suburban intersection. Our 3D trajectory dataset aims to enhance autonomous driving systems by providing detailed environmental 3D representations, which could lead to improved obstacle interactions and safety. We demonstrate its utility across multiple applications including motion prediction, motion planning, scenario mining, and generative reactive traffic agents. Our interactive online visualization platform and the complete dataset are publicly available at app.deepscenario.com, facilitating research in motion prediction, behavior modeling, and safety validation.

Via

Access Paper or Ask Questions

Shape Your Ground: Refining Road Surfaces Beyond Planar Representations

Apr 15, 2025

Oussema Dhaouadi, Johannes Meier, Jacques Kaiser, Daniel Cremers

Abstract:Road surface reconstruction from aerial images is fundamental for autonomous driving, urban planning, and virtual simulation, where smoothness, compactness, and accuracy are critical quality factors. Existing reconstruction methods often produce artifacts and inconsistencies that limit usability, while downstream tasks have a tendency to represent roads as planes for simplicity but at the cost of accuracy. We introduce FlexRoad, the first framework to directly address road surface smoothing by fitting Non-Uniform Rational B-Splines (NURBS) surfaces to 3D road points obtained from photogrammetric reconstructions or geodata providers. Our method at its core utilizes the Elevation-Constrained Spatial Road Clustering (ECSRC) algorithm for robust anomaly correction, significantly reducing surface roughness and fitting errors. To facilitate quantitative comparison between road surface reconstruction methods, we present GeoRoad Dataset (GeRoD), a diverse collection of road surface and terrain profiles derived from openly accessible geodata. Experiments on GeRoD and the photogrammetry-based DeepScenario Open 3D Dataset (DSC3D) demonstrate that FlexRoad considerably surpasses commonly used road surface representations across various metrics while being insensitive to various input sources, terrains, and noise types. By performing ablation studies, we identify the key role of each component towards high-quality reconstruction performance, making FlexRoad a generic method for realistic road surface modeling.

Via

Access Paper or Ask Questions

MonoCT: Overcoming Monocular 3D Detection Domain Shift with Consistent Teacher Models

Mar 17, 2025

Johannes Meier, Louis Inchingolo, Oussema Dhaouadi, Yan Xia, Jacques Kaiser, Daniel Cremers

Abstract:We tackle the problem of monocular 3D object detection across different sensors, environments, and camera setups. In this paper, we introduce a novel unsupervised domain adaptation approach, MonoCT, that generates highly accurate pseudo labels for self-supervision. Inspired by our observation that accurate depth estimation is critical to mitigating domain shifts, MonoCT introduces a novel Generalized Depth Enhancement (GDE) module with an ensemble concept to improve depth estimation accuracy. Moreover, we introduce a novel Pseudo Label Scoring (PLS) module by exploring inner-model consistency measurement and a Diversity Maximization (DM) strategy to further generate high-quality pseudo labels for self-training. Extensive experiments on six benchmarks show that MonoCT outperforms existing SOTA domain adaptation methods by large margins (~21% minimum for AP Mod.) and generalizes well to car, traffic camera and drone views.

* ICRA2025

Via

Access Paper or Ask Questions

CARLA Drone: Monocular 3D Object Detection from a Different Perspective

Aug 21, 2024

Johannes Meier, Luca Scalerandi, Oussema Dhaouadi, Jacques Kaiser, Nikita Araslanov, Daniel Cremers

Figure 1 for CARLA Drone: Monocular 3D Object Detection from a Different Perspective

Figure 2 for CARLA Drone: Monocular 3D Object Detection from a Different Perspective

Figure 3 for CARLA Drone: Monocular 3D Object Detection from a Different Perspective

Figure 4 for CARLA Drone: Monocular 3D Object Detection from a Different Perspective

Abstract:Existing techniques for monocular 3D detection have a serious restriction. They tend to perform well only on a limited set of benchmarks, faring well either on ego-centric car views or on traffic camera views, but rarely on both. To encourage progress, this work advocates for an extended evaluation of 3D detection frameworks across different camera perspectives. We make two key contributions. First, we introduce the CARLA Drone dataset, CDrone. Simulating drone views, it substantially expands the diversity of camera perspectives in existing benchmarks. Despite its synthetic nature, CDrone represents a real-world challenge. To show this, we confirm that previous techniques struggle to perform well both on CDrone and a real-world 3D drone dataset. Second, we develop an effective data augmentation pipeline called GroundMix. Its distinguishing element is the use of the ground for creating 3D-consistent augmentation of a training image. GroundMix significantly boosts the detection accuracy of a lightweight one-stage detector. In our expanded evaluation, we achieve the average precision on par with or substantially higher than the previous state of the art across all tested datasets.

Via

Access Paper or Ask Questions

Embodied Synaptic Plasticity with Online Reinforcement learning

Mar 03, 2020

Jacques Kaiser, Michael Hoff, Andreas Konle, J. Camilo Vasquez Tieck, David Kappel, Daniel Reichard, Anand Subramoney, Robert Legenstein, Arne Roennau, Wolfgang Maass(+1 more)

Figure 1 for Embodied Synaptic Plasticity with Online Reinforcement learning

Figure 2 for Embodied Synaptic Plasticity with Online Reinforcement learning

Figure 3 for Embodied Synaptic Plasticity with Online Reinforcement learning

Figure 4 for Embodied Synaptic Plasticity with Online Reinforcement learning

Abstract:The endeavor to understand the brain involves multiple collaborating research fields. Classically, synaptic plasticity rules derived by theoretical neuroscientists are evaluated in isolation on pattern classification tasks. This contrasts with the biological brain which purpose is to control a body in closed-loop. This paper contributes to bringing the fields of computational neuroscience and robotics closer together by integrating open-source software components from these two fields. The resulting framework allows to evaluate the validity of biologically-plausibe plasticity models in closed-loop robotics environments. We demonstrate this framework to evaluate Synaptic Plasticity with Online REinforcement learning (SPORE), a reward-learning rule based on synaptic sampling, on two visuomotor tasks: reaching and lane following. We show that SPORE is capable of learning to perform policies within the course of simulated hours for both tasks. Provisional parameter explorations indicate that the learning rate and the temperature driving the stochastic processes that govern synaptic learning dynamics need to be regulated for performance improvements to be retained. We conclude by discussing the recent deep reinforcement learning techniques which would be beneficial to increase the functionality of SPORE on visuomotor tasks.

* Frontiers in neurorobotics, volume 13, p81, 2019
* 18 pages, 5 figures, published in frontiers in neurorobotics

Via

Access Paper or Ask Questions

Embodied Neuromorphic Vision with Event-Driven Random Backpropagation

May 06, 2019

Jacques Kaiser, Alexander Friedrich, J. Camilo Vasquez Tieck, Daniel Reichard, Arne Roennau, Emre Neftci, Rüdiger Dillmann

Figure 1 for Embodied Neuromorphic Vision with Event-Driven Random Backpropagation

Figure 2 for Embodied Neuromorphic Vision with Event-Driven Random Backpropagation

Figure 3 for Embodied Neuromorphic Vision with Event-Driven Random Backpropagation

Figure 4 for Embodied Neuromorphic Vision with Event-Driven Random Backpropagation

Abstract:Spike-based communication between biological neurons is sparse and unreliable. This enables the brain to process visual information from the eyes efficiently. Taking inspiration from biology, artificial spiking neural networks coupled with silicon retinas attempt to model these computations. Recent findings in machine learning allowed the derivation of a family of powerful synaptic plasticity rules approximating backpropagation for spiking networks. Are these rules capable of processing real-world visual sensory data? In this paper, we evaluate the performance of Event-Driven Random Back-Propagation (eRBP) at learning representations from event streams provided by a Dynamic Vision Sensor (DVS). First, we show that eRBP matches state-of-the-art performance on the DvsGesture dataset with the addition of a simple covert attention mechanism. By remapping visual receptive fields relatively to the center of the motion, this attention mechanism provides translation invariance at low computational cost compared to convolutions. Second, we successfully integrate eRBP in a real robotic setup, where a robotic arm grasps objects according to detected visual affordances. In this setup, visual information is actively sensed by a DVS mounted on a robotic head performing microsaccadic eye movements. We show that our method classifies affordances within 100ms after microsaccade onset, which is comparable to human performance reported in behavioral study. Our results suggest that advances in neuromorphic technology and plasticity rules enable the development of autonomous robots operating at high speed and low energy consumption.

* v2: title update, better plots and wordings. 8 pages, 9 figures, 1 table, video: https://neurorobotics-files.net/index.php/s/sBQzWFrBPoH9Dx7

Via

Access Paper or Ask Questions

Synaptic Plasticity Dynamics for Deep Continuous Local Learning

Nov 27, 2018

Jacques Kaiser, Hesham Mostafa, Emre Neftci

Figure 1 for Synaptic Plasticity Dynamics for Deep Continuous Local Learning

Figure 2 for Synaptic Plasticity Dynamics for Deep Continuous Local Learning

Figure 3 for Synaptic Plasticity Dynamics for Deep Continuous Local Learning

Figure 4 for Synaptic Plasticity Dynamics for Deep Continuous Local Learning

Abstract:A growing body of work underlines striking similarities between spiking neural networks modeling biological networks and recurrent, binary neural networks. A relatively smaller body of work, however, discuss similarities between learning dynamics employed in deep artificial neural networks and synaptic plasticity in spiking neural networks. The challenge preventing this is largely due to the discrepancy between dynamical properties of synaptic plasticity and the requirements for gradient backpropagation. Here, we demonstrate that deep learning algorithms that locally approximate the gradient backpropagation updates using locally synthesized gradients overcome this challenge. Locally synthesized gradients were initially proposed to decouple one or more layers from the rest of the network so as to improve parallelism. Here, we exploit these properties to derive gradient-based learning rules in spiking neural networks. Our approach results in highly efficient spiking neural networks and synaptic plasticity capable of training deep neural networks. Furthermore, our method utilizes existing autodifferentation methods in machine learning frameworks to systematically derive synaptic plasticity rules from task-relevant cost functions and neural dynamics. We benchmark our approach on the MNIST and DVS Gestures dataset, and report state-of-the-art results on the latter. Our results provide continuously learning machines that are not only relevant to biology, but suggestive of a brain-inspired computer architecture that matches the performances of GPUs on target tasks.

* work under progress

Via

Access Paper or Ask Questions