Alfredo Nazábal

Robust Variational Autoencoders for Outlier Detection in Mixed-Type Data

Jul 15, 2019

Simão Eduardo, Alfredo Nazábal, Christopher K. I. Williams, Charles Sutton

Abstract: We focus on the problem of unsupervised cell outlier detection in mixed-type tabular datasets. Traditional methods for outlier detection are concerned only with detecting which rows in the dataset are outliers. However, identifying which cells in the dataset corrupt a specific row is an important problem in practice, especially in high-dimensional tables. We introduce the Robust Variational Autoencoder (RVAE), a deep generative model that learns the joint distribution of the clean data while identifying the outlier cells in the dataset. RVAE learns the probability of each cell in the dataset being an outlier, balancing the contributions of the different likelihood models in the row outlier score, which makes the method suitable for outlier detection in mixed-type datasets. We show experimentally that the RVAE performs better than several state-of-the-art methods in cell outlier detection for tabular datasets, while providing comparable or better results for row outlier detection.

* In submission to NeurIPS 2019
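
The idea in the abstract above, that a per-cell outlier probability can put different likelihood models on a common scale, can be illustrated with a generic two-component mixture. The sketch below is not the RVAE implementation: the function name `outlier_scores`, the prior `alpha`, and the assumption that per-cell log-likelihoods under a "clean" and a broad "outlier" model are already available are all hypothetical choices made for illustration.

```python
import numpy as np

def outlier_scores(log_p_clean, log_p_outlier, alpha=0.9):
    """Illustrative cell and row outlier scores from a two-component mixture.

    log_p_clean, log_p_outlier: arrays of shape (n_rows, n_cols) holding the
        log-likelihood of each cell under a clean model and under a broad
        outlier model (assumed to be computed elsewhere).
    alpha: assumed prior probability that a cell is clean (hypothetical value).
    """
    a = np.log(alpha) + np.asarray(log_p_clean, dtype=float)
    b = np.log1p(-alpha) + np.asarray(log_p_outlier, dtype=float)
    # Posterior probability that each cell is clean, by Bayes' rule:
    # pi = alpha * p_clean / (alpha * p_clean + (1 - alpha) * p_outlier)
    log_pi = a - np.logaddexp(a, b)
    pi = np.exp(log_pi)
    # Cell score: -log(pi). It is near 0 for cells that look clean and grows
    # as the outlier component explains the cell better.
    cell_score = -log_pi
    # Row score: sum of cell scores. Every term comes from a probability in
    # [0, 1], so real-valued and categorical columns contribute on one scale.
    row_score = cell_score.sum(axis=1)
    return pi, cell_score, row_score
```

Because each cell contributes through a probability rather than a raw density, a continuous column with a very peaked likelihood cannot dominate the row score merely because densities are unbounded while categorical probabilities are not.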

Autoencoders and Probabilistic Inference with Missing Data: An Exact Solution for The Factor Analysis Case

Sep 03, 2018

Christopher K. I. Williams, Charlie Nash, Alfredo Nazábal

Abstract: Latent variable models can be used to probabilistically "fill in" missing data entries. The variational autoencoder architecture (Kingma and Welling, 2014; Rezende et al., 2014) includes a "recognition" or "encoder" network that infers the latent variables given the data variables. However, it is not clear how to handle missing data variables in this network. The factor analysis (FA) model is a basic autoencoder, using linear encoder and decoder networks. We show how to calculate exactly the latent posterior distribution for the FA model in the presence of missing data, and note that this solution exhibits a non-trivial dependence on the pattern of missingness. We also discuss various approximations to the exact solution. Experiments compare the effectiveness of various approaches to filling in the missing data.

* 7 pages, 2 figures
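
As a rough illustration of the exact posterior mentioned in the abstract above, the following NumPy sketch applies standard Gaussian conditioning to a factor analysis model x = W z + mu + eps, with z ~ N(0, I) and eps ~ N(0, diag(psi)). The function names, the use of NaN to mark missing entries, and the parameter names `W`, `mu`, `psi` are assumptions made here, not taken from the paper.

```python
import numpy as np

def fa_posterior_missing(x, W, mu, psi):
    """Posterior mean and covariance of z given only the observed entries of x.

    x:   data vector of shape (d,), with np.nan marking missing entries
    W:   factor loading matrix of shape (d, k)
    mu:  mean vector of shape (d,)
    psi: diagonal noise variances of shape (d,)
    """
    obs = ~np.isnan(x)
    W_o = W[obs]              # loadings for the observed dimensions only
    psi_o = psi[obs]
    r = x[obs] - mu[obs]      # centred observed values
    k = W.shape[1]
    # Posterior precision: I + W_o^T Psi_o^{-1} W_o
    precision = np.eye(k) + W_o.T @ (W_o / psi_o[:, None])
    cov = np.linalg.inv(precision)
    # Posterior mean: cov @ W_o^T Psi_o^{-1} (x_o - mu_o)
    mean = cov @ (W_o.T @ (r / psi_o))
    return mean, cov

def fa_impute(x, W, mu, psi):
    """Fill missing entries with their expected value under the posterior."""
    mean, _ = fa_posterior_missing(x, W, mu, psi)
    x_hat = W @ mean + mu
    return np.where(np.isnan(x), x_hat, x)
```

Both the posterior mean and the posterior covariance depend on which rows of `W` are selected by the observation mask, so two inputs with different missingness patterns effectively pass through different linear encoders. This is one way to see the dependence on the pattern of missingness that the abstract describes.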
