Abstract: Data analysis is often an iterative process in which solutions must be continuously refined: as new data becomes available, an existing solution must be updated to incorporate the latest information. In addition to seeking a high-quality solution for the task at hand, it is crucial to ensure consistency by avoiding drastic changes from previous solutions. Applying this approach across many iterations ensures that the solution evolves gradually and smoothly. In this paper, we study this problem in the context of clustering, focusing specifically on the $k$-center problem. More precisely, we study the following problem: given a set of points $X$, parameters $k$ and $b$, and a prior clustering solution $H$ for $X$, the goal is to compute a new solution $C$ for $X$, consisting of $k$ centers, that minimizes the clustering cost while introducing at most $b$ changes from $H$. We refer to this problem as label-consistent $k$-center and propose two constant-factor approximation algorithms for it. We complement our theoretical findings with an experimental evaluation demonstrating the effectiveness of our methods on real-world datasets.
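
For concreteness, one natural formalization consistent with this description (our reading; the paper may define the cost and the notion of a "change" differently) takes the clustering cost to be the standard $k$-center objective and counts a change as a point whose assigned label differs between $H$ and $C$:
\[
\min_{\substack{C \subseteq X \\ |C| = k}} \; \max_{x \in X} \, \min_{c \in C} d(x, c)
\qquad \text{subject to} \qquad \bigl|\{\, x \in X : \ell_C(x) \neq \ell_H(x) \,\}\bigr| \le b,
\]
where $d$ is the underlying metric and $\ell_C(x)$, $\ell_H(x)$ denote the label (assigned center) of point $x$ under $C$ and under the prior solution $H$, respectively.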




Abstract: Finding (bi-)clusters in bipartite graphs is a popular data analysis approach. Analysts typically want to visualize the clusters, which is simple as long as the clusters are disjoint. However, many modern algorithms find overlapping clusters, making visualization more complicated. In this paper, we study the problem of visualizing \emph{a given clustering} of overlapping clusters in bipartite graphs, and the related problem of visualizing Boolean Matrix Factorizations. We conceptualize three objectives that any good visualization should satisfy: (1) proximity of cluster elements, (2) large consecutive areas of elements from the same cluster, and (3) large uninterrupted areas in the visualization, regardless of cluster membership. We provide objective functions that capture these goals and algorithms that optimize them. Interestingly, in experiments on real-world datasets, we find that the best trade-off between these competing goals is achieved by a novel heuristic, which locally aims to place rows and columns with similar cluster membership next to each other.
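
As a rough illustration of the placement idea (a sketch of ours, not the paper's heuristic or objective functions; the function name and the simple lexicographic ordering are assumptions), one can order the rows of a binary cluster-membership matrix so that rows with similar membership patterns become adjacent:

```python
import numpy as np

def order_by_membership(membership):
    """Order rows of a binary membership matrix (rows = elements,
    columns = clusters) so that rows with identical or similar
    membership patterns end up next to each other.  This lexicographic
    sort only illustrates the 'similar rows adjacent' idea; the
    paper's heuristic works locally and may differ substantially."""
    # np.lexsort uses the LAST key as the primary sort key, so reverse
    # the columns to make the first cluster the primary key.
    keys = membership.T[::-1]
    return np.lexsort(keys)

# Toy example: 5 elements, 2 overlapping clusters.
M = np.array([[1, 0],
              [0, 1],
              [1, 1],
              [1, 0],
              [0, 1]])

order = order_by_membership(M)
print(M[order])  # rows grouped by their membership patterns
```

The same ordering can be applied independently to the columns using their own cluster-membership matrix, yielding a row/column arrangement in which same-cluster blocks tend to form contiguous areas.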