Abstract:Persistent homology is one of the most popular methods in Topological Data Analysis. An initial step in any analysis with persistent homology involves constructing a nested sequence of simplicial complexes, called a filtration, from a point cloud. There is an abundance of different complexes to choose from, with Rips, Alpha, and witness complexes being popular choices. In this manuscript, we build a different type of a geometrically-informed simplicial complex, called an ellipsoid complex. This complex is based on the idea that ellipsoids aligned with tangent directions better approximate the data compared to conventional (Euclidean) balls centered at sample points that are used in the construction of Rips and Alpha complexes, for instance. We use Principal Component Analysis to estimate tangent spaces directly from samples and present algorithms as well as an implementation for computing ellipsoid barcodes, i.e., topological descriptors based on ellipsoid complexes. Furthermore, we conduct extensive experiments and compare ellipsoid barcodes with standard Rips barcodes. Our findings indicate that ellipsoid complexes are particularly effective for estimating homology of manifolds and spaces with bottlenecks from samples. In particular, the persistence intervals corresponding to a ground-truth topological feature are longer compared to the intervals obtained when using the Rips complex of the data. Furthermore, ellipsoid barcodes lead to better classification results in sparsely-sampled point clouds. Finally, we demonstrate that ellipsoid barcodes outperform Rips barcodes in classification tasks.