Abstract: Current graph neural networks (GNNs) for node classification tend to focus only on nodewise scores and are evaluated solely with nodewise metrics. This limits uncertainty estimation on graphs, since nodewise marginals do not fully characterize the joint distribution given the graph structure. In this work, we propose novel edgewise metrics, namely the edgewise expected calibration error (ECE) and the agree/disagree ECEs, which provide criteria for uncertainty estimation on graphs beyond the nodewise setting. Our experiments demonstrate that the proposed edgewise metrics complement the nodewise results and yield additional insights. Moreover, we show that GNN models that consider the structured prediction problem on graphs tend to produce better uncertainty estimates, which illustrates the benefit of going beyond the nodewise setting.
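To make the distinction between nodewise and edgewise calibration concrete, the following is a minimal sketch of the standard binned ECE together with one possible edgewise variant. The abstract does not define the edgewise ECE, so the edge confidence used here (the product of the two endpoint confidences) is purely an illustrative assumption, and the names `expected_calibration_error` and `edgewise_ece` are hypothetical.

```python
import numpy as np

def expected_calibration_error(confidence, correct, n_bins=10):
    """Standard binned ECE with equal-width, right-inclusive bins."""
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Map each confidence in (0, 1] to a bin index in [0, n_bins - 1].
    bin_ids = np.clip(np.ceil(confidence * n_bins).astype(int) - 1, 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Weight each bin's |accuracy - mean confidence| by its share of samples.
            gap = abs(correct[mask].mean() - confidence[mask].mean())
            ece += mask.mean() * gap
    return float(ece)

def edgewise_ece(probs, labels, edge_index, n_bins=10):
    """Illustrative edgewise ECE.

    ASSUMPTION (not taken from the abstract): the confidence of an edge is the
    product of its endpoint confidences, and an edge counts as correct only if
    both endpoints are classified correctly.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    src, dst = np.asarray(edge_index)
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    edge_conf = conf[src] * conf[dst]
    edge_correct = (pred[src] == labels[src]) & (pred[dst] == labels[dst])
    return expected_calibration_error(edge_conf, edge_correct, n_bins)
```

Under this assumption, a model whose nodewise marginals are well calibrated can still score poorly on edges, because the product of marginals ignores the correlation between neighboring nodes.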
Abstract: Given the importance of calibrated predictions and reliable uncertainty estimates, various post-hoc calibration methods have been developed for neural networks on standard multi-class classification tasks. However, these methods are not well suited for calibrating graph neural networks (GNNs), which present unique challenges such as accounting for the graph structure and the graph-induced correlations between the nodes. In this work, we conduct a systematic study of the calibration quality of GNN node predictions. In particular, we identify five factors that influence the calibration of GNNs: a general under-confident tendency, diversity of nodewise predictive distributions, distance to training nodes, relative confidence level, and neighborhood similarity. Furthermore, based on the insights from this study, we design a novel calibration method named Graph Attention Temperature Scaling (GATS), which is tailored for calibrating graph neural networks. GATS incorporates designs that address all the identified factors and produces a nodewise temperature for scaling using an attention-based architecture. GATS is simultaneously accuracy-preserving, data-efficient, and expressive. Our experiments empirically verify the effectiveness of GATS, demonstrating that it consistently achieves state-of-the-art calibration results on various graph datasets for different GNN backbones.
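The accuracy-preserving property of nodewise temperature scaling can be illustrated with a short sketch. This is not the GATS architecture: the attention-based model that predicts the temperatures is not reproduced, and the per-node temperatures are simply passed in as a hypothetical input. The function name `nodewise_temperature_scale` is likewise an assumption for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nodewise_temperature_scale(logits, temperatures):
    """Divide each node's logits by its own positive temperature, then softmax.

    `temperatures` is a hypothetical input here; in a learned calibrator it
    would be predicted per node rather than supplied by hand.
    """
    logits = np.asarray(logits, dtype=float)
    t = np.asarray(temperatures, dtype=float).reshape(-1, 1)
    if np.any(t <= 0):
        raise ValueError("temperatures must be strictly positive")
    return softmax(logits / t)

# Two nodes, three classes: T < 1 sharpens the first node, T > 1 softens the second.
logits = np.array([[2.0, 1.0, 0.0], [0.0, 3.0, 1.0]])
calibrated = nodewise_temperature_scale(logits, [0.5, 2.0])
```

Because dividing a row of logits by a positive scalar does not change their ordering, the predicted class of every node stays the same; only the confidence changes. A temperature below 1 raises confidence, which is the direction needed to correct an under-confident model.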
Abstract: We propose a new method for testing antenna arrays that records the radiated electromagnetic (EM) field using an absorbing material and evaluates the resulting thermal image series with an AI method based on a conditional encoder-decoder model. Given the power and phase of the signals fed into each array element, we reconstruct normal sequences with our trained model and compare them to the real sequences observed by a thermal camera. These thermograms contain only low-level patterns such as blobs of various shapes. A contour-based anomaly detector then maps the reconstruction error matrix to an anomaly score to identify faulty antenna arrays, increasing the classification F-measure (F-M) by up to 46%. We demonstrate our approach on the time series thermograms collected by our antenna testing system. Conventionally, a variational autoencoder (VAE) that learns the observation noise may yield better results than a VAE with a constant noise assumption. However, we demonstrate that this is not the case for anomaly detection on such low-level patterns, for two reasons. First, the baseline metric, reconstruction probability, which incorporates the learned observation noise, fails to differentiate anomalous patterns. Second, the area under the receiver operating characteristic (ROC) curve of a VAE with a lower observation noise assumption is 11.83% higher than that of a VAE with learned noise.
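One way to see why learned observation noise can hurt anomaly detection is to look at how the noise enters a Gaussian reconstruction likelihood. The sketch below assumes a pixelwise Gaussian observation model, which is a common choice for VAEs but is not stated in the abstract; the function `gaussian_nll` and the toy 4x4 arrays are hypothetical and are not the authors' data or detector.

```python
import numpy as np

def gaussian_nll(x, mu, sigma):
    """Summed negative log-likelihood of x under N(mu, sigma^2), per pixel."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    if np.any(sigma <= 0):
        raise ValueError("sigma must be strictly positive")
    per_pixel = 0.5 * np.log(2 * np.pi * sigma**2) + (x - mu) ** 2 / (2 * sigma**2)
    return float(np.sum(per_pixel))

# Toy reconstruction: the model predicts an all-zero thermogram.
mu = np.zeros((4, 4))
normal = mu.copy()
anomalous = mu.copy()
anomalous[1, 2] = 1.0  # one pixel deviates from the reconstruction

# Constant, small noise assumption: the deviation is penalized heavily.
gap_constant = gaussian_nll(anomalous, mu, 0.1) - gaussian_nll(normal, mu, 0.1)

# Hypothetical learned noise map that is large exactly where the deviation occurs.
sigma_learned = np.full((4, 4), 0.1)
sigma_learned[1, 2] = 1.0
gap_learned = gaussian_nll(anomalous, mu, sigma_learned) - gaussian_nll(normal, mu, sigma_learned)
```

In this toy setting the squared error is divided by the variance, so a large learned variance at the anomalous pixel shrinks the score gap between the normal and the anomalous input. This is only an illustration of the mechanism under the stated assumptions, not a reproduction of the reported results.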