Abstract: There is a growing need for algorithms and techniques that can organize big data accurately and efficiently. Clustering, the grouping of dataset elements by similarity, can be computationally expensive, especially when a massive dataset must be divided into a relatively large number of groups. The task of clustering millions or even billions of data points into thousands or even millions of clusters is referred to as $\textit{extreme clustering}$. We have devised a distributed method that efficiently solves extreme clustering problems using a quantum annealer.