Abstract:This work introduces IsUMap, a novel manifold learning technique that enhances data representation by integrating aspects of UMAP and Isomap with Vietoris-Rips filtrations. We present a systematic and detailed construction of a metric representation for locally distorted metric spaces that captures complex data structures more accurately than the previous schemes. Our approach addresses limitations in existing methods by accommodating non-uniform data distributions and intricate local geometries. We validate its performance through extensive experiments on examples of various geometric objects and benchmark real-world datasets, demonstrating significant improvements in representation quality.
Abstract:The extensive surviving corpus of the ancient scholar Plutarch of Chaeronea (ca. 45-120 CE) also contains several texts which, according to current scholarly opinion, did not originate with him and are therefore attributed to an anonymous author Pseudo-Plutarch. These include, in particular, the work Placita Philosophorum (Quotations and Opinions of the Ancient Philosophers), which is extremely important for the history of ancient philosophy. Little is known about the identity of that anonymous author and its relation to other authors from the same period. This paper presents a BERT language model for Ancient Greek. The model discovers previously unknown statistical properties relevant to these literary, philosophical, and historical problems and can shed new light on this authorship question. In particular, the Placita Philosophorum, together with one of the other Pseudo-Plutarch texts, shows similarities with the texts written by authors from an Alexandrian context (2nd/3rd century CE).
Abstract:Topological data analysis asks when balls in a metric space $(X,d)$ intersect. Geometric data analysis asks how much balls have to be enlarged to intersect. We connect this principle to the traditional core geometric concept of curvature. This enables us, on one hand, to reconceptualize curvature and link it to the geometric notion of hyperconvexity. On the other hand, we can then also understand methods of topological data analysis from a geometric perspective.
Abstract:This position paper looks into the formation of language and shows ties between structural properties of the words in the English language and their polysemy. Using Ollivier-Ricci curvature over a large graph of synonyms to estimate polysemy it shows empirically that the words that arguably are easier to pronounce also tend to have multiple meanings.
Abstract:This paper focuses on latent representations that could effectively decompose different aspects of textual information. Using a framework of style transfer for texts, we propose several empirical methods to assess information decomposition quality. We validate these methods with several state-of-the-art textual style transfer methods. Higher quality of information decomposition corresponds to higher performance in terms of bilingual evaluation understudy (BLEU) between output and human-written reformulations.
Abstract:Many experiments have been performed that use evolutionary algorithms for learning the topology and connection weights of a neural network that controls a robot or virtual agent. These experiments are not only performed to better understand basic biological principles, but also with the hope that with further progress of the methods, they will become competitive for automatically creating robot behaviors of interest. However, current methods are limited with respect to the (Kolmogorov) complexity of evolved behavior. Using the evolution of robot trajectories as an example, we show that by adding four features, namely (1) freezing of previously evolved structure, (2) temporal scaffolding, (3) a homogeneous transfer function for output nodes, and (4) mutations that create new pathways to outputs, to standard methods for the evolution of neural networks, we can achieve an approximately linear growth of the complexity of behavior over thousands of generations. Overall, evolved complexity is up to two orders of magnitude over that achieved by standard methods in the experiments reported here, with the major limiting factor for further growth being the available run time. Thus, the set of methods proposed here promises to be a useful addition to various current neuroevolution methods.