Abstract: Radar sensors are low-cost, long-range, and weather-resilient. They are therefore widely used for driver assistance functions and are expected to be crucial for the success of autonomous driving. Many perception tasks consider only pre-processed radar point clouds. Radar spectra, in contrast, are a raw form of radar measurement and contain more information than radar point clouds, but they are difficult to interpret. In this work, we explore the semantic information contained in spectra in the context of automated driving, thereby moving towards better interpretability of radar spectra. To this end, we create a radar spectra-language model that allows us to query radar spectra measurements for the presence of scene elements using free text. We overcome the scarcity of radar spectra data by matching the embedding space of an existing vision-language model (VLM). Finally, we explore the benefit of the learned representation for scene parsing and obtain improvements in free space segmentation and object detection merely by injecting the spectra embedding into a baseline model.
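The two ideas in this abstract, matching the embedding space of an existing VLM and querying spectra with free text, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `embedding_matching_loss` and `query_scores` are hypothetical, the cosine-distance loss is an assumed choice, and the embeddings are taken as given arrays standing in for the outputs of a radar spectra encoder and a frozen VLM.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps avoids division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def embedding_matching_loss(spectra_emb, image_emb):
    """Mean cosine distance between paired embeddings.

    Assumption: spectra_emb[i] and image_emb[i] come from the same scene,
    and image_emb is produced by a frozen VLM image encoder.
    Both arrays have shape (batch, dim).
    """
    s = l2_normalize(spectra_emb)
    v = l2_normalize(image_emb)
    cosine = np.sum(s * v, axis=-1)
    return float(np.mean(1.0 - cosine))


def query_scores(spectra_emb, text_embs):
    """Cosine similarity of one spectra embedding (dim,) with each
    text-prompt embedding in text_embs (num_prompts, dim)."""
    s = l2_normalize(spectra_emb)
    t = l2_normalize(text_embs)
    return t @ s
```

Under these assumptions, a free-text query reduces to embedding each prompt with the VLM text encoder and ranking the prompts by `query_scores`; a higher score suggests the described scene element is more likely present in the spectrum.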
Abstract: We introduce BIDCD, the Bosch Industrial Depth Completion Dataset. BIDCD is a new RGBD dataset of metallic industrial objects, collected with a depth camera mounted on a robotic manipulator. The main purpose of this dataset is to facilitate the training of domain-specific depth completion models for use in logistics and manufacturing tasks. We trained a state-of-the-art depth completion model on this dataset and report the results, setting an initial benchmark.
Abstract: Depth cameras are a prominent perception system for robotics, especially when operating in natural, unstructured environments. Industrial applications, however, typically involve reflective objects under harsh lighting conditions. This is a challenging scenario for depth cameras, as it induces numerous reflections and deflections, leading to loss of robustness and deteriorated accuracy. Here, we developed a deep model to correct the depth channel in RGBD images, aiming to restore the depth information to the required accuracy. To train the model, we created a novel industrial dataset that we now present to the public. The data was collected with low-end depth cameras, and the ground truth depth was generated by multi-view fusion.