Abstract: We present a consistent approach to density-based clustering, which satisfies a stability theorem that holds without any distributional assumptions. We also show that the algorithm can be combined with standard procedures to extract a flat clustering from a hierarchical clustering, and that the resulting flat clustering algorithms satisfy stability theorems. The algorithms and proofs are inspired by topological data analysis.
Abstract: We propose two algorithms to cluster a set of clusterings of a fixed dataset, such as the clusterings produced by running a clustering algorithm over a range of parameters or from many initializations. We use these algorithms to study the effects of varying the parameters of HDBSCAN, and to study methods for initializing $k$-means.