Abstract:Quantifying model uncertainty is critical for understanding prediction reliability, yet distinguishing between aleatoric and epistemic uncertainty remains challenging. We extend recent work from classification to regression to provide a novel frequentist approach to epistemic and aleatoric uncertainty estimation. We train models to generate conditional predictions by feeding their initial output back as an additional input. This method allows for a rigorous measurement of model uncertainty by observing how prediction responses change when conditioned on the model's previous answer. We provide a complete theoretical framework to analyze epistemic uncertainty in regression in a frequentist way, and explain how it can be exploited in practice to gauge a model's uncertainty, with minimal changes to the original architecture.
Abstract:This paper addresses a new active learning strategy for regression problems. The presented Wasserstein active regression model is based on the principles of distribution-matching to measure the representativeness of the labeled dataset. The Wasserstein distance is computed using GroupSort Neural Networks. The use of such networks provides theoretical foundations giving a way to quantify errors with explicit bounds for their size and depth. This solution is combined with another uncertainty-based approach that is more outlier-tolerant to complete the query strategy. Finally, this method is compared with other classical and recent solutions. The study empirically shows the pertinence of such a representativity-uncertainty approach, which provides good estimation all along the query procedure. Moreover, the Wasserstein active regression often achieves more precise estimations and tends to improve accuracy faster than other models.