Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Michael Opitz

FAST3D: Flow-Aware Self-Training for 3D Object Detectors

Oct 18, 2021

Christian Fruhwirth-Reisinger, Michael Opitz, Horst Possegger, Horst Bischof

Figure 1 for FAST3D: Flow-Aware Self-Training for 3D Object Detectors

Figure 2 for FAST3D: Flow-Aware Self-Training for 3D Object Detectors

Figure 3 for FAST3D: Flow-Aware Self-Training for 3D Object Detectors

Figure 4 for FAST3D: Flow-Aware Self-Training for 3D Object Detectors

Abstract:In the field of autonomous driving, self-training is widely applied to mitigate distribution shifts in LiDAR-based 3D object detectors. This eliminates the need for expensive, high-quality labels whenever the environment changes (e.g., geographic location, sensor setup, weather condition). State-of-the-art self-training approaches, however, mostly ignore the temporal nature of autonomous driving data. To address this issue, we propose a flow-aware self-training method that enables unsupervised domain adaptation for 3D object detectors on continuous LiDAR point clouds. In order to get reliable pseudo-labels, we leverage scene flow to propagate detections through time. In particular, we introduce a flow-based multi-target tracker, that exploits flow consistency to filter and refine resulting tracks. The emerged precise pseudo-labels then serve as a basis for model re-training. Starting with a pre-trained KITTI model, we conduct experiments on the challenging Waymo Open Dataset to demonstrate the effectiveness of our approach. Without any prior target domain knowledge, our results show a significant improvement over the state-of-the-art.

* Accepted to BMVC 2021

Via

Access Paper or Ask Questions

Robustness of Object Detectors in Degrading Weather Conditions

Jun 16, 2021

Muhammad Jehanzeb Mirza, Cornelius Buerkle, Julio Jarquin, Michael Opitz, Fabian Oboril, Kay-Ulrich Scholl, Horst Bischof

Figure 1 for Robustness of Object Detectors in Degrading Weather Conditions

Figure 2 for Robustness of Object Detectors in Degrading Weather Conditions

Figure 3 for Robustness of Object Detectors in Degrading Weather Conditions

Figure 4 for Robustness of Object Detectors in Degrading Weather Conditions

Abstract:State-of-the-art object detection systems for autonomous driving achieve promising results in clear weather conditions. However, such autonomous safety critical systems also need to work in degrading weather conditions, such as rain, fog and snow. Unfortunately, most approaches evaluate only on the KITTI dataset, which consists only of clear weather scenes. In this paper we address this issue and perform one of the most detailed evaluation on single and dual modality architectures on data captured in real weather conditions. We analyse the performance degradation of these architectures in degrading weather conditions. We demonstrate that an object detection architecture performing good in clear weather might not be able to handle degrading weather conditions. We also perform ablation studies on the dual modality architectures and show their limitations.

* Accepted for publication at ITSC 2021

Via

Access Paper or Ask Questions

FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

Dec 19, 2019

Georg Krispel, Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof

Figure 1 for FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

Figure 2 for FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

Figure 3 for FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

Figure 4 for FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

Abstract:We introduce a simple yet effective fusion method of LiDAR and RGB data to segment LiDAR point clouds. Utilizing the dense native range representation of a LiDAR sensor and the setup calibration, we establish point correspondences between the two input modalities. Subsequently, we are able to warp and fuse the features from one domain into the other. Therefore, we can jointly exploit information from both data sources within one single network. To show the merit of our method, we extend SqueezeSeg, a point cloud segmentation network, with an RGB feature branch and fuse it into the original structure. Our extension called FuseSeg leads to an improvement of up to 18% IoU on the KITTI benchmark. In addition to the improved accuracy, we also achieve real-time performance at 50 fps, five times as fast as the KITTI LiDAR data recording speed.

* Accepted for publication in WACV 2020

Via

Access Paper or Ask Questions

MURAUER: Mapping Unlabeled Real Data for Label AUstERity

Dec 05, 2018

Georg Poier, Michael Opitz, David Schinagl, Horst Bischof

Figure 1 for MURAUER: Mapping Unlabeled Real Data for Label AUstERity

Figure 2 for MURAUER: Mapping Unlabeled Real Data for Label AUstERity

Figure 3 for MURAUER: Mapping Unlabeled Real Data for Label AUstERity

Figure 4 for MURAUER: Mapping Unlabeled Real Data for Label AUstERity

Abstract:Data labeling for learning 3D hand pose estimation models is a huge effort. Readily available, accurately labeled synthetic data has the potential to reduce the effort. However, to successfully exploit synthetic data, current state-of-the-art methods still require a large amount of labeled real data. In this work, we remove this requirement by learning to map from the features of real data to the features of synthetic data mainly using a large amount of synthetic and unlabeled real data. We exploit unlabeled data using two auxiliary objectives, which enforce that (i) the mapped representation is pose specific and (ii) at the same time, the distributions of real and synthetic data are aligned. While pose specifity is enforced by a self-supervisory signal requiring that the representation is predictive for the appearance from different views, distributions are aligned by an adversarial term. In this way, we can significantly improve the results of the baseline system, which does not use unlabeled data and outperform many recent approaches already with about 1% of the labeled real data. This presents a step towards faster deployment of learning based hand pose estimation, making it accessible for a larger range of applications.

* WACV 2019; Project page at https://poier.github.io/murauer

Via

Access Paper or Ask Questions

Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems

May 11, 2018

Georg Waltner, Michael Maurer, Thomas Holzmann, Patrick Ruprecht, Michael Opitz, Horst Possegger, Friedrich Fraundorfer, Horst Bischof

Figure 1 for Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems

Figure 2 for Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems

Figure 3 for Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems

Figure 4 for Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems

Abstract:Automated toll systems rely on proper classification of the passing vehicles. This is especially difficult when the images used for classification only cover parts of the vehicle. To obtain information about the whole vehicle. we reconstruct the vehicle as 3D object and exploit this additional information within a Convolutional Neural Network (CNN). However, when using deep networks for 3D object classification, large amounts of dense 3D models are required for good accuracy, which are often neither available nor feasible to process due to memory requirements. Therefore, in our method we reproject the 3D object onto the image plane using the reconstructed points, lines or both. We utilize this sparse depth prior within an auxiliary network branch that acts as a regularizer during training. We show that this auxiliary regularizer helps to improve accuracy compared to 2D classification on a real-world dataset. Furthermore due to the design of the network, at test time only the 2D camera images are required for classification which enables the usage in portable computer vision systems.

* Submitted to the IEEE International Conference on Intelligent Transportation Systems 2018 (ITSC), 6 pages, 4 figures; changed format in compliance with adapted IEEE template

Via

Access Paper or Ask Questions

Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Jan 15, 2018

Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof

Figure 1 for Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Figure 2 for Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Figure 3 for Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Figure 4 for Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Abstract:Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Further, we propose two loss functions which increase the diversity in our ensemble. These loss functions can be applied either for weight initialization or during training. Together, our contributions leverage large embedding sizes more effectively by significantly reducing correlation of the embedding and consequently increase retrieval accuracy of the embedding. Our method works with any differentiable loss function and does not introduce any additional parameters during test time. We evaluate our metric learning method on image retrieval tasks and show that it improves over state-of-the-art methods on the CUB 200-2011, Cars-196, Stanford Online Products, In-Shop Clothes Retrieval and VehicleID datasets.

* Extension to our paper BIER: Boosting Independent Embeddings Robustly (ICCV 2017 oral) - submitted to PAMI

Via

Access Paper or Ask Questions

Grid Loss: Detecting Occluded Faces

Sep 01, 2016

Michael Opitz, Georg Waltner, Georg Poier, Horst Possegger, Horst Bischof

Figure 1 for Grid Loss: Detecting Occluded Faces

Figure 2 for Grid Loss: Detecting Occluded Faces

Figure 3 for Grid Loss: Detecting Occluded Faces

Figure 4 for Grid Loss: Detecting Occluded Faces

Abstract:Detection of partially occluded objects is a challenging computer vision problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of the detection window are occluded, since not every sub-part of the window is discriminative on its own. To address this issue, we propose a novel loss layer for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a convolution layer independently rather than over the whole feature map. This results in parts being more discriminative on their own, enabling the detector to recover if the detection window is partially occluded. By mapping our loss layer back to a regular fully connected layer, no additional computational cost is incurred at runtime compared to standard CNNs. We demonstrate our method for face detection on several public face detection benchmarks and show that our method outperforms regular CNNs, is suitable for realtime applications and achieves state-of-the-art performance.

* accepted to ECCV 2016

Via

Access Paper or Ask Questions