Jonas M. B. Haslbeck

Estimating the Number of Clusters via Normalized Cluster Instability

Oct 12, 2018

Jonas M. B. Haslbeck, Dirk U. Wulff

Figures 1-4 for Estimating the Number of Clusters via Normalized Cluster Instability

Abstract: We improve current instability-based methods for selecting the number of clusters $k$ in cluster analysis by developing a normalized cluster instability measure that corrects for the distribution of cluster sizes, a previously unaccounted-for driver of cluster instability. We show that our normalized instability measure outperforms current instability-based measures across the whole sequence of possible $k$ and, in particular, overcomes their limitations for large $k$. We also compare, for the first time, model-based and model-free approaches to determining cluster instability and find their performance to be comparable. We make our method available in the R package cstab.
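To make the idea of instability-based selection of $k$ concrete, here is a minimal sketch of a generic, unnormalized instability computation. It is not the authors' method and not the cstab API: the function names (`pair_disagreement`, `kmeans`, `instability`), the use of k-means, the bootstrap scheme, and the comparison on objects shared by both resamples are all assumptions made for illustration. The normalization for cluster-size distribution that the abstract describes is not reproduced, because the listing gives no formula for it.

```python
import random
from itertools import combinations


def pair_disagreement(labels_a, labels_b):
    """Share of object pairs grouped together in one clustering but apart in the other."""
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 0.0
    differ = sum(
        (labels_a[i] == labels_a[j]) != (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return differ / len(pairs)


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's algorithm on tuples of floats; returns one label per point."""
    centers = rng.sample(points, k)  # assumption: random data points as initial centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def instability(points, k, n_pairs=20, seed=0):
    """Mean pairwise disagreement between clusterings of two bootstrap samples.

    Hypothetical helper: each clustering is compared only on the objects that
    appear in both bootstrap samples. No normalization is applied.
    """
    rng = random.Random(seed)
    n = len(points)
    scores = []
    for _ in range(n_pairs):
        idx_a = [rng.randrange(n) for _ in range(n)]
        idx_b = [rng.randrange(n) for _ in range(n)]
        lab_a = dict(zip(idx_a, kmeans([points[i] for i in idx_a], k, rng)))
        lab_b = dict(zip(idx_b, kmeans([points[i] for i in idx_b], k, rng)))
        shared = sorted(set(idx_a) & set(idx_b))
        scores.append(
            pair_disagreement([lab_a[i] for i in shared], [lab_b[i] for i in shared])
        )
    return sum(scores) / len(scores)
```

Under these assumptions, one would evaluate `instability(points, k)` over a sequence of candidate values of $k$ and prefer the value with the lowest score. The abstract's point is that such raw scores also depend on the distribution of cluster sizes, which is the effect the normalized measure is designed to correct.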
