Abstract:This work proposes a semantic segmentation network that produces high-quality uncertainty estimates in a single forward pass. We exploit general representations from foundation models and unlabelled datasets through a Masked Image Modeling (MIM) approach, which is robust to augmentation hyper-parameters and simpler than previous techniques. For neural networks used in safety-critical applications, bias in the training data can lead to errors; therefore it is crucial to understand a network's limitations at run time and act accordingly. To this end, we test our proposed method on a number of test domains including the SAX Segmentation benchmark, which includes labelled test data from dense urban, rural and off-road driving domains. The proposed method consistently outperforms uncertainty estimation and Out-of-Distribution (OoD) techniques on this difficult benchmark.
Abstract:Knowing when a trained segmentation model is encountering data that is different to its training data is important. Understanding and mitigating the effects of this play an important part in their application from a performance and assurance perspective - this being a safety concern in applications such as autonomous vehicles (AVs). This work presents a segmentation network that can detect errors caused by challenging test domains without any additional annotation in a single forward pass. As annotation costs limit the diversity of labelled datasets, we use easy-to-obtain, uncurated and unlabelled data to learn to perform uncertainty estimation by selectively enforcing consistency over data augmentation. To this end, a novel segmentation benchmark based on the SAX Dataset is used, which includes labelled test data spanning three autonomous-driving domains, ranging in appearance from dense urban to off-road. The proposed method, named Gamma-SSL, consistently outperforms uncertainty estimation and Out-of-Distribution (OoD) techniques on this difficult benchmark - by up to 10.7% in area under the receiver operating characteristic (ROC) curve and 19.2% in area under the precision-recall (PR) curve in the most challenging of the three scenarios.