Abstract:Simplicity bias is an intriguing phenomenon prevalent in various input-output maps, characterized by a preference for simpler, more regular, or symmetric outputs. Notably, these maps typically feature high-probability outputs with simple patterns, whereas complex patterns are exponentially less probable. This bias has been extensively examined and attributed to principles derived from algorithmic information theory and algorithmic probability. In a significant advancement, it has been demonstrated that the renowned logistic map $x_{k+1}=\mu x_k(1-x_k)$, and other one-dimensional maps exhibit simplicity bias when conceptualized as input-output systems. Building upon this foundational work, our research delves into the manifestations of simplicity bias within the random logistic map, specifically focusing on scenarios involving additive noise. This investigation is driven by the overarching goal of formulating a comprehensive theory for the prediction and analysis of time series.Our primary contributions are multifaceted. We discover that simplicity bias is observable in the random logistic map for specific ranges of $\mu$ and noise magnitudes. Additionally, we find that this bias persists even with the introduction of small measurement noise, though it diminishes as noise levels increase. Our studies also revisit the phenomenon of noise-induced chaos, particularly when $\mu=3.83$, revealing its characteristics through complexity-probability plots. Intriguingly, we employ the logistic map to underscore a paradoxical aspect of data analysis: more data adhering to a consistent trend can occasionally lead to reduced confidence in extrapolation predictions, challenging conventional wisdom.We propose that adopting a probability-complexity perspective in analyzing dynamical systems could significantly enrich statistical learning theories related to series prediction.
Abstract:Methods of pattern recognition and machine learning are applied extensively in science, technology, and society. Hence, any advances in related theory may translate into large-scale impact. Here we explore how algorithmic information theory, especially algorithmic probability, may aid in a machine learning task. We study a multiclass supervised classification problem, namely learning the RNA molecule sequence-to-shape map, where the different possible shapes are taken to be the classes. The primary motivation for this work is a proof of concept example, where a concrete, well-motivated machine learning task can be aided by approximations to algorithmic probability. Our approach is based on directly estimating the class (i.e., shape) probabilities from shape complexities, and using the estimated probabilities as a prior in a Gaussian process learning problem. Naturally, with a large amount of training data, the prior has no significant influence on classification accuracy, but in the very small training data regime, we show that using the prior can substantially improve classification accuracy. To our knowledge, this work is one of the first to demonstrate how algorithmic probability can aid in a concrete, real-world, machine learning problem.