Abstract: In real-world autonomous driving, deep learning models can suffer performance degradation due to distributional shifts between the training data and the driving conditions encountered. As is typical in machine learning, it is difficult to acquire a large and representative labeled test set to validate models before deployment in the wild. In this work, we introduce complementary learning, in which we use characteristics learned under different training paradigms to detect model errors. We demonstrate our approach by learning semantic and predictive motion labels in point clouds in a supervised and a self-supervised manner, and by subsequently detecting and classifying model discrepancies. We perform a large-scale qualitative analysis and, for an extensive quantitative analysis, present LidarCODA, the first dataset with labeled anomalies in lidar point clouds.