Title: Wasserstein-based Kernels for Clustering: Application to Power Distribution Graphs

Mar 18, 2025

Alfredo Oneto, Blazhe Gjorgiev, Giovanni Sansavini

Abstract: Many data clustering applications must handle objects that cannot be represented as vector data. In this context, the bag-of-vectors representation can be leveraged to describe complex objects through discrete distributions, and the Wasserstein distance can effectively measure the dissimilarity between them. Additionally, kernel methods can be used to embed data into feature spaces that are easier to analyze. Despite significant progress in data clustering, a method that simultaneously accounts for distributional and vectorial dissimilarity measures is still lacking. To tackle this gap, this work explores kernel methods and Wasserstein distance metrics to develop a computationally tractable clustering framework. The compositional properties of kernels allow the simultaneous handling of different metrics, enabling the integration of both vectors and discrete distributions for object representation. This approach is flexible enough to be applied in various domains, such as graph analysis and image processing. The framework consists of three main components. First, we efficiently approximate pairwise Wasserstein distances using multiple reference distributions. Second, we employ kernel functions based on Wasserstein distances and present ways of composing kernels to express different types of information. Finally, we use the kernels to cluster data and evaluate the quality of the results using scalable and distance-agnostic validity indices. A case study involving two datasets of 879 and 34,920 power distribution graphs demonstrates the framework's effectiveness and efficiency.
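The idea of composing a Wasserstein-based kernel with a vector kernel can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one-dimensional bags of equal size with uniform weights (where the Wasserstein-1 distance reduces to the mean absolute difference of sorted samples), an exponential kernel on that distance, and a product with a Gaussian kernel on the vector part. All function names and the `gamma` parameters are hypothetical, and the reference-distribution approximation mentioned in the abstract is not reproduced here.

```python
import math


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two 1-D empirical distributions.

    Assumption for this sketch: both bags have the same number of points
    and uniform weights, so the optimal coupling matches sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes bags of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def wasserstein_kernel(xs, ys, gamma=1.0):
    """Exponential kernel on the 1-D Wasserstein-1 distance (hypothetical choice)."""
    return math.exp(-gamma * wasserstein_1d(xs, ys))


def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel on ordinary feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def composed_kernel(obj_a, obj_b, gamma_w=1.0, gamma_v=1.0):
    """Product of the distributional and the vectorial kernel.

    Each object is a pair (bag, vector). The product of two positive
    semidefinite kernels is again positive semidefinite.
    """
    bag_a, vec_a = obj_a
    bag_b, vec_b = obj_b
    return (wasserstein_kernel(bag_a, bag_b, gamma_w)
            * rbf_kernel(vec_a, vec_b, gamma_v))


def gram_matrix(objects, gamma_w=1.0, gamma_v=1.0):
    """Pairwise kernel matrix, usable as input to a kernel clustering method."""
    return [[composed_kernel(a, b, gamma_w, gamma_v) for b in objects]
            for a in objects]


# Toy objects: (bag of scalar samples, feature vector).
objects = [
    ([0.0, 1.0, 2.0], [1.0, 0.0]),
    ([2.0, 0.0, 1.0], [1.0, 0.0]),   # same bag in a different order
    ([5.0, 6.0, 7.0], [0.0, 3.0]),
]
K = gram_matrix(objects)
```

The resulting matrix `K` could then be passed to any clustering routine that accepts a precomputed kernel or affinity matrix. In higher dimensions the sorted-sample shortcut no longer applies, and an optimal transport solver or an approximation scheme would be needed instead.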
