Nenad Mladenovic

Big-means: Less is More for K-means Clustering

Apr 14, 2022

Rustam Mussabayev, Nenad Mladenovic, Bassem Jarboui, Ravil Mussabayev

Abstract: K-means clustering plays a vital role in data mining. However, its performance drops drastically when it is applied to huge amounts of data. We propose a new heuristic, built on regular K-means, for faster and more accurate big data clustering using the "less is more" and minimum sum-of-squares clustering (MSSC) decomposition approaches. The main advantage of the proposed algorithm is that it naturally turns the K-means local search into a global one by decomposing the MSSC problem. On the one hand, decomposing the MSSC problem into smaller subproblems reduces the computational complexity and allows the subproblems to be processed in parallel. On the other hand, the MSSC decomposition provides a new method for natural, data-driven shaking of the incumbent solution while introducing a new neighborhood structure for the MSSC problem. This leads to a new heuristic that improves K-means in big data conditions. The scalability of the algorithm to big data can be easily adjusted by choosing an appropriate number of subproblems and their size. The proposed algorithm is both scalable and accurate. In our experiments, it outperforms all recent state-of-the-art algorithms for the MSSC problem in terms of both time and solution quality.
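The decomposition idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that each subproblem is a uniform random sample of the data, that K-means on each sample is warm-started from the incumbent centers, and that a new solution is accepted when its objective on its own sample is lower than the best seen so far. The names `big_means_sketch`, `sample_size`, and `n_samples` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points, centers):
    """MSSC objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations (the local search), started from `centers`."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(xs) / len(members) for xs in zip(*members)]
    return centers


def big_means_sketch(data, k, sample_size, n_samples, seed=0):
    """Assumed scheme: solve many small sampled subproblems, keep the best."""
    rng = random.Random(seed)
    best_centers, best_obj = None, float("inf")
    for _ in range(n_samples):
        # A fresh sample perturbs the incumbent in a data-driven way.
        sample = rng.sample(data, sample_size)
        # Warm start from the incumbent; the first round uses random points.
        start = best_centers if best_centers is not None else rng.sample(sample, k)
        centers = kmeans(sample, start)
        obj = sse(sample, centers)
        if obj < best_obj:  # assumed acceptance rule
            best_centers, best_obj = centers, obj
    return best_centers


# Usage: two well-separated 1-D groups, clustered from samples of 20 points.
data = [(0.1 * i,) for i in range(25)] + [(100 + 0.1 * i,) for i in range(25)]
centers = sorted(big_means_sketch(data, k=2, sample_size=20, n_samples=5))
```

In this sketch, `sample_size` and `n_samples` play the role of the subproblem size and the number of subproblems that the abstract describes as the scalability controls. The loop is sequential for brevity; the parallel processing mentioned in the abstract is not shown.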

Towards an intelligent VNS heuristic for the k-labelled spanning forest problem

Mar 05, 2015

Sergio Consoli, Josè Andrès Moreno Pèrez, Nenad Mladenovic

Abstract: In an ongoing project, we investigate a new possibility for solving the k-labelled spanning forest (kLSF) problem with an intelligent Variable Neighbourhood Search (Int-VNS) metaheuristic. In the kLSF problem we are given an undirected input graph G and a positive integer k, and the aim is to find a spanning forest of G that has the minimum number of connected components while using at most k labels. The problem is related to the minimum labelling spanning tree (MLST) problem, whose goal is to find the spanning tree of the input graph with the minimum number of labels, and it has several real-world applications in which one aims to ensure connectivity by means of homogeneous connections. The Int-VNS metaheuristic that we propose for the kLSF problem is derived from the promising intelligent VNS strategy recently proposed for the MLST problem. It integrates the basic VNS for the kLSF problem with complementary approaches from machine learning, statistics and experimental algorithmics, in order to produce high-quality performance and to completely automate the resulting strategy.

* Computer Aided Systems Theory, pages 79-80 (2015)
* 2 pages, Fifteenth International Conference on Computer Aided Systems Theory (EUROCAST 2015), Las Palmas de Gran Canaria, Spain
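The kLSF objective described in the abstract above can be made concrete with a small Python sketch. It only evaluates a candidate label set, by counting the connected components left when edges with the chosen labels are kept, and adds an exhaustive baseline for tiny instances. It is not the Int-VNS metaheuristic, and the names `count_components` and `best_label_set`, as well as the edge format `(u, v, label)`, are assumptions made for illustration.

```python
from itertools import combinations


def count_components(n_vertices, labelled_edges, chosen_labels):
    """Components of the subgraph that keeps only edges with a chosen label."""
    parent = list(range(n_vertices))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = n_vertices
    for u, v, label in labelled_edges:
        if label in chosen_labels:
            ru, rv = find(u), find(v)
            if ru != rv:  # the edge joins two different components
                parent[ru] = rv
                components -= 1
    return components


def best_label_set(n_vertices, labelled_edges, k):
    """Exhaustive baseline: try every label set of the largest allowed size.

    Adding labels never increases the component count, so sets of size
    min(k, number of labels) are enough. Only practical for tiny graphs.
    """
    labels = sorted({label for _, _, label in labelled_edges})
    size = min(k, len(labels))
    best = min(combinations(labels, size),
               key=lambda s: count_components(n_vertices, labelled_edges, set(s)))
    return set(best), count_components(n_vertices, labelled_edges, set(best))


# Usage on a hypothetical 5-vertex graph with labels 'a', 'b', 'c'.
edges = [(0, 1, "a"), (1, 2, "b"), (2, 3, "a"), (3, 4, "c"), (0, 4, "b")]
chosen, n_comp = best_label_set(5, edges, k=2)
```

A heuristic such as a VNS would replace the exhaustive enumeration with moves over label sets, using a function like `count_components` to score each candidate.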
