Graz University of Technology
Abstract:eXplainable Artificial Intelligence (XAI) aims at providing understandable explanations of black box models. In this paper, we evaluate current XAI methods by scoring them based on ground truth simulations and sensitivity analysis. To this end, we used an Electric Arc Furnace (EAF) model to better understand the limits and robustness characteristics of XAI methods such as SHapley Additive exPlanations (SHAP), Local Interpretable Model-agnostic Explanations (LIME), as well as Averaged Local Effects (ALE) or Smooth Gradients (SG) in a highly topical setting. These XAI methods were applied to various types of black-box models and then scored based on their correctness compared to the ground-truth sensitivity of the data-generating processes using a novel scoring evaluation methodology over a range of simulated additive noise. The resulting evaluation shows that the capability of the Machine Learning (ML) models to capture the process accurately is, indeed, coupled with the correctness of the explainability of the underlying data-generating process. We furthermore show the differences between XAI methods in their ability to correctly predict the true sensitivity of the modeled industrial process.
Abstract:The Bethe free energy approximation provides an effective way for relaxing NP-hard problems of probabilistic inference. However, its accuracy depends on the model parameters and particularly degrades if a phase transition in the model occurs. In this work, we analyze when the Bethe approximation is reliable and how this can be verified. We argue and show by experiment that it is mostly accurate if it is convex on a submanifold of its domain, the 'Bethe box'. For verifying its convexity, we derive two sufficient conditions that are based on the definiteness properties of the Bethe Hessian matrix: the first uses the concept of diagonal dominance, and the second decomposes the Bethe Hessian matrix into a sum of sparse matrices and characterizes the definiteness properties of the individual matrices in that sum. These theoretical results provide a simple way to estimate the critical phase transition temperature of a model. As a practical contribution we propose $\texttt{BETHE-MIN}$, a projected quasi-Newton method to efficiently find a minimum of the Bethe free energy.
Abstract:Bayesian causal inference, i.e., inferring a posterior over causal models for use in downstream causal reasoning tasks, poses a hard computational inference problem that is little explored in the literature. In this work, we combine techniques from order-based MCMC structure learning with recent advances in gradient-based graph learning into an effective Bayesian causal inference framework. Specifically, we decompose the problem of inferring the causal structure into (i) inferring a topological order over variables and (ii) inferring the parent sets for each variable. When limiting the number of parents per variable, we can exactly marginalise over the parent sets in polynomial time. We further use Gaussian processes to model the unknown causal mechanisms, which also allows their exact marginalisation. This introduces a Rao-Blackwellization scheme, where all components are eliminated from the model, except for the causal order, for which we learn a distribution via gradient-based optimisation. The combination of Rao-Blackwellization with our sequential inference procedure for causal orders yields state-of-the-art performance on linear and non-linear additive noise benchmarks with scale-free and Erdős-Rényi graph structures.
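The polynomial-time marginalisation over parent sets mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the `local_log_score` callback, and the implicit uniform prior over parent sets are assumptions made for illustration. Given a fixed topological order, each variable may only choose parents among its predecessors, so the sum over graphs factorises into one sum per variable over subsets of bounded size.

```python
import math
from itertools import combinations


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_marginal_given_order(order, local_log_score, max_parents):
    """Sum local scores over all parent sets consistent with `order`.

    `local_log_score(child, parents)` is a hypothetical callback that
    returns the log score of `child` given the tuple `parents`.
    With at most `max_parents` parents, each variable contributes
    O(n ** max_parents) terms, so the total cost is polynomial in n.
    """
    total = 0.0
    for position, child in enumerate(order):
        predecessors = order[:position]
        terms = []
        for k in range(min(max_parents, len(predecessors)) + 1):
            for parents in combinations(predecessors, k):
                terms.append(local_log_score(child, parents))
        total += log_sum_exp(terms)
    return total


def uniform_score(child, parents):
    """Toy score (assumption): every parent set has log score 0."""
    return 0.0


log_marginal = log_marginal_given_order(["a", "b", "c"], uniform_score, max_parents=1)
```

With the toy uniform score, the result simply counts the admissible parent sets per variable (1, 2, and 3 for the three variables), so `log_marginal` equals the logarithm of their product.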
Abstract:In this paper we propose a new method for training neural networks (NNs) for frequency modulated continuous wave (FMCW) radar mutual interference mitigation. Instead of training NNs to regress from interfered to clean radar signals as in previous work, we train NNs directly on object detection maps. We do so by performing a continuous relaxation of the cell-averaging constant false alarm rate (CA-CFAR) peak detector, which is a well-established algorithm for object detection using radar. With this new training objective we are able to increase object detection performance by a large margin. Furthermore, we introduce separable convolution kernels to strongly reduce the number of parameters and computational complexity of convolutional NN architectures for radar applications. We validate our contributions with experiments on real-world measurement data and compare them against signal processing interference mitigation methods.
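To make the idea of a continuous relaxation of CA-CFAR concrete, the following is a minimal one-dimensional sketch. The abstract does not specify the relaxation used, so the sigmoid with a `temperature` parameter, the function names, and the default `guard`, `train`, and `alpha` values are all assumptions chosen for illustration. The hard detector compares each cell to a threshold derived from the average of its training cells; the soft version replaces that comparison with a smooth function, which gives usable gradients.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ca_cfar_thresholds(power, guard=1, train=4, alpha=3.0):
    """Per-cell threshold: alpha times the mean of the training cells.

    Training cells lie `guard + 1` to `guard + train` cells away on
    either side of the cell under test; cells outside the signal are skipped.
    """
    n = len(power)
    if n < guard + 2:
        raise ValueError("signal too short for the chosen guard size")
    thresholds = []
    for i in range(n):
        cells = []
        for offset in range(guard + 1, guard + train + 1):
            for j in (i - offset, i + offset):
                if 0 <= j < n:
                    cells.append(power[j])
        thresholds.append(alpha * sum(cells) / len(cells))
    return thresholds


def hard_detections(power, thresholds):
    """Classical CA-CFAR decision: 1.0 where the cell exceeds its threshold."""
    return [1.0 if p > t else 0.0 for p, t in zip(power, thresholds)]


def soft_detections(power, thresholds, temperature=1.0):
    """Assumed relaxation: sigmoid of the scaled margin above the threshold."""
    return [sigmoid((p - t) / temperature) for p, t in zip(power, thresholds)]


# Toy power profile: flat noise floor with a single strong target at cell 10.
power = [1.0] * 20
power[10] = 50.0
thresholds = ca_cfar_thresholds(power)
hard = hard_detections(power, thresholds)
soft = soft_detections(power, thresholds)
```

As `temperature` approaches zero, the soft detection map approaches the hard one; larger values give smoother outputs at the cost of less sharply separated detections.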
Abstract:We present a data-driven car occupancy detection algorithm using ultra-wideband radar based on the ResNet architecture. The algorithm is trained on a dataset of channel impulse responses obtained from measurements at three different activity levels of the occupants (i.e., breathing, talking, moving). We compare the presented algorithm against a state-of-the-art car occupancy detection algorithm based on variational message passing (VMP). Our presented ResNet architecture is able to outperform the VMP algorithm in terms of the area under the receiver operating characteristic curve (AUC) at low signal-to-noise ratios (SNRs) for all three activity levels of the target. Specifically, for an SNR of -20 dB the VMP detector achieves an AUC of 0.87 while the ResNet architecture achieves an AUC of 0.91 if the target is sitting still and breathing naturally. The difference in performance for the other activities is similar. To facilitate the implementation in the onboard computer of a car, we perform an ablation study to optimize the tradeoff between performance and computational complexity for several ResNet architectures. The dataset used to train and evaluate the algorithm is openly accessible. This facilitates an easy comparison in future works.
Abstract:We present a fast update rule for variational block-sparse Bayesian learning (SBL) methods. Using a variational Bayesian framework, we show how repeated updates of probability density functions (PDFs) of the prior variances and weights can be expressed as a nonlinear first-order recurrence from one estimate of the parameters of the proxy PDFs to the next. Specifically, the recurrent relation turns out to be a strictly increasing rational function for many commonly used prior PDFs of the variances, such as the Jeffreys prior. Hence, the fixed points of this recurrent relation can be obtained by solving for the roots of a polynomial. This scheme allows checking for convergence/divergence of individual prior variances in a single step. Thereby, the computational complexity of the variational block-SBL algorithm is reduced and the convergence speed is improved by two orders of magnitude in our simulations. Furthermore, the solution allows insights into the sparsity of the estimators obtained by choosing different priors.
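The step from a rational recurrence to polynomial roots can be illustrated with a toy example. The recurrence below is hypothetical and is not the update rule derived in the paper; it only shows the mechanism. For a recurrence x_{k+1} = (a x_k + b) / (c x_k + d), the fixed points satisfy c x^2 + (d - a) x - b = 0, so the limit of the iteration can be read off from the roots instead of being approached step by step.

```python
import math


def recurrence(x, a, b, c, d):
    """One step of the toy rational recurrence x -> (a*x + b) / (c*x + d)."""
    return (a * x + b) / (c * x + d)


def fixed_points(a, b, c, d):
    """Real roots of c*x**2 + (d - a)*x - b = 0, i.e. solutions of f(x) = x.

    Assumes c != 0 so that the fixed-point equation is a true quadratic.
    """
    disc = (d - a) ** 2 + 4.0 * b * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(-(d - a) - root) / (2.0 * c), (-(d - a) + root) / (2.0 * c)])


# Toy coefficients (assumption): f(x) = 2x / (x + 1), strictly increasing for x >= 0.
a, b, c, d = 2.0, 0.0, 1.0, 1.0

# Closed-form answer: fixed points obtained in a single step.
roots = fixed_points(a, b, c, d)

# Reference answer: iterate the recurrence many times from a positive start.
x = 0.5
for _ in range(50):
    x = recurrence(x, a, b, c, d)
```

In this toy case the roots are 0 and 1, and the iteration started at 0.5 converges to 1, the stable fixed point. Solving the polynomial replaces the 50 iterations with one computation, which mirrors the speed-up argument made in the abstract.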
Abstract:Multiple-Input Multiple-Output (MIMO) systems are essential for wireless communications. Since classical algorithms for symbol detection in MIMO setups require large computational resources or provide poor results, data-driven algorithms are becoming more popular. Most of the proposed algorithms, however, introduce approximations leading to degraded performance for realistic MIMO systems. In this paper, we introduce a neural-enhanced hybrid model, augmenting the analytic backbone algorithm with state-of-the-art neural network components. In particular, we introduce a self-attention model for the enhancement of the iterative Orthogonal Approximate Message Passing (OAMP)-based decoding algorithm. In our experiments, we show that the proposed model can outperform existing data-driven approaches for OAMP while having improved generalization to other SNR values at limited computational overhead.
Abstract:In this paper, we present a variational inference algorithm that decomposes a signal into multiple groups of related spectral lines. The spectral lines in each group are associated with a group parameter common to all spectral lines within the group. The proposed algorithm jointly estimates the group parameters, the number of spectral lines within a group, and the number of groups exploiting a Bernoulli-Gamma-Gaussian hierarchical prior model which promotes sparse solutions. Aiming to maximize the evidence lower bound (ELBO), variational inference provides analytic approximations of the posterior probability density functions (PDFs) and also gives estimates of the additional model parameters such as the measurement noise variance. While the activation variables of the groups and the associated group parameters (such as fundamental frequencies and the corresponding higher order harmonics) are estimated as point estimates, the remaining parameters such as the complex amplitudes of the spectral lines and their precision parameters are estimated as approximate posterior PDFs. We demonstrate the versatility and performance of the proposed algorithm on three different inference problems. In particular, the proposed algorithm is applied to the multi-pitch estimation problem, the radar signal-based extended object estimation problem, and variational mode decomposition (VMD) using synthetic measurements, and to a real multi-pitch estimation problem using the Bach-10 dataset. The results show that the proposed algorithm outperforms state-of-the-art model-based and pre-trained algorithms on all three inference problems.
Abstract:We present a variational message passing (VMP) approach to detect the presence of a person based on their respiratory chest motion using ultra-wideband (UWB) radar and to estimate the respiratory motion for contact-free vital sign monitoring. The received signal is modeled by a backscatter channel. The respiratory motion and propagation channel are estimated using VMP, while the presence of a person is detected by the evidence lower bound (ELBO). Numerical analyses and measurements demonstrate that the proposed method leads to a significant improvement in the detection performance compared to a fast Fourier transform (FFT)-based detector or an estimator-correlator, since the multipath components (MPCs) are better incorporated into the detection procedure. Specifically, the proposed method has a detection probability of 0.95 at -20 dB signal-to-noise ratio (SNR), while the estimator-correlator and FFT-based detector have 0.32 and 0.05, respectively.
Abstract:Causal discovery and causal reasoning are classically treated as separate and consecutive tasks: one first infers the causal graph, and then uses it to estimate causal effects of interventions. However, such a two-stage approach is uneconomical, especially in terms of actively collected interventional data, since the causal query of interest may not require a fully-specified causal model. From a Bayesian perspective, it is also unnatural, since a causal query (e.g., the causal graph or some causal effect) can be viewed as a latent quantity subject to posterior inference -- other unobserved quantities that are not of direct interest (e.g., the full causal model) ought to be marginalized out in this process and contribute to our epistemic uncertainty. In this work, we propose Active Bayesian Causal Inference (ABCI), a fully-Bayesian active learning framework for integrated causal discovery and reasoning, which jointly infers a posterior over causal models and queries of interest. In our approach to ABCI, we focus on the class of causally-sufficient, nonlinear additive noise models, which we model using Gaussian processes. We sequentially design experiments that are maximally informative about our target causal query, collect the corresponding interventional data, and update our beliefs to choose the next experiment. Through simulations, we demonstrate that our approach is more data-efficient than several baselines that only focus on learning the full causal graph. This allows us to accurately learn downstream causal queries from fewer samples while providing well-calibrated uncertainty estimates for the quantities of interest.