Abstract: Given a default distribution $P$ and a set of test data $x^M=\{x_1,x_2,\ldots,x_M\}$, this paper asks whether it is likely that $x^M$ was generated by $P$. For discrete distributions, the definitive answer is in principle given by Kolmogorov-Martin-L\"{o}f randomness. In this paper we seek to generalize this to continuous distributions. We consider a set of statistics $T_1(x^M),T_2(x^M),\ldots$. To each statistic we associate its maximum entropy distribution and, with this, a universal source coder. The maximum entropy distributions are subsequently combined to give a total codelength, which is compared with $-\log P(x^M)$. We show that this approach satisfies a number of theoretical properties. For real-world data, $P$ is usually unknown. We transform data into a standard distribution in the latent space using a bidirectional generative network and use maximum entropy coding there. We compare the resulting method to other methods that also use generative neural networks to detect anomalies. In most cases, our results show better performance.
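The codelength comparison described above can be illustrated with a minimal sketch. It is not the paper's method: it assumes a standard normal default distribution $P$, takes the sample mean and sample second moment as the only statistics (whose maximum entropy distribution is a Gaussian), and approximates the universal coder by a plug-in maximum-likelihood codelength plus an MDL-style parameter cost of $\frac{1}{2}\log_2 M$ bits per parameter. The function names are hypothetical.

```python
import math

def codelength_default(xs):
    """Bits to encode xs under the assumed default P = N(0, 1).

    This is -log2 of the density; the quantization constant is shared
    with the alternative coder and therefore omitted.
    """
    const = 0.5 * math.log2(2 * math.pi)
    return sum(const + 0.5 * x * x * math.log2(math.e) for x in xs)

def codelength_maxent_gaussian(xs):
    """Bits under a Gaussian fitted to xs, plus an assumed parameter cost.

    The Gaussian is the maximum entropy distribution for a fixed mean and
    variance. The (k/2) * log2(M) penalty with k = 2 is an MDL-style
    approximation chosen for illustration only.
    """
    m = len(xs)
    mean = sum(xs) / m
    var = max(sum((x - mean) ** 2 for x in xs) / m, 1e-12)  # guard against log(0)
    data_bits = 0.5 * m * math.log2(2 * math.pi * math.e * var)
    param_bits = (2 / 2) * math.log2(m)
    return data_bits + param_bits

def is_atypical(xs):
    """Flag xs when the alternative coder is shorter than coding with P."""
    return codelength_maxent_gaussian(xs) < codelength_default(xs)

# Illustrative data: one set roughly consistent with N(0, 1), one shifted by 3.
typical = [-1.5, -0.5, 0.5, 1.5] * 5
shifted = [x + 3.0 for x in typical]
print(is_atypical(typical), is_atypical(shifted))
```

In this sketch the parameter cost is what keeps data consistent with $P$ from being flagged: the fitted Gaussian always matches the data at least as well as $P$, so without the penalty nearly every sample would appear atypical.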
Abstract: Heart rate variability (HRV) is a vital measure of autonomic nervous system function and a key indicator of cardiovascular condition. This paper proposes a novel method, called the pattern tree, an extension of Willems' context tree to real-valued data, to investigate HRV via an atypicality framework. In a previous paper, atypicality was developed as a method for mining and discovery in "Big Data," which requires a universal approach. Using the proposed pattern tree as a universal source coder in this framework led to the discovery of arrhythmias and unknown patterns in HRV Holter monitoring.