Abstract: Computer vision practitioners must thoroughly understand their models' performance, but conditional evaluation is complex and error-prone. In biometric verification, model performance over continuous covariates (real-valued attributes of images that affect performance) is particularly challenging to study. We develop a generative model of the match and non-match score distributions over continuous covariates and perform inference with modern Bayesian methods. We use mixture models to capture arbitrary distributions and local basis functions to capture non-linear, multivariate trends. Three experiments demonstrate the accuracy and effectiveness of our approach. First, we study the relationship between age and face verification performance and find that previous methods may overstate performance and confidence. Second, we study preprocessing for CNNs and find a highly non-linear, multivariate surface of model performance. Our method is accurate and data-efficient when evaluated against previous synthetic methods. Third, we demonstrate the novel application of our method to pedestrian tracking and calculate variable thresholds and expected performance while controlling for multiple covariates.
Abstract: Pedestrian re-identification (ReID) is the task of continuously recognising the same individual across time and camera views. Researchers of pedestrian ReID and their GPUs spend enormous energy producing novel algorithms, challenging datasets, and readily accessible tools to successfully improve results on standard metrics. Yet practitioners in biometrics, surveillance, and autonomous driving have not realized benefits that reflect these metrics. Different detections, slight occlusions, changes in perspective, and other banal perturbations render the best neural networks virtually useless. This work makes two contributions. First, we introduce the ReID community to a budding area of computer vision research in model evaluation. By adapting established principles of psychophysical evaluation from psychology, we can quantify the performance degradation and begin research that will improve the utility of pedestrian ReID models; not just their performance on test sets. Second, we introduce NuscenesReID, a challenging new ReID dataset designed to reflect the real world autonomous vehicle conditions in which ReID algorithms are used. We show that, despite performing well on existing ReID datasets, most models are not robust to synthetic augmentations or to the more realistic NuscenesReID data.