Abstract: Existing clustering approaches remain largely constrained by traditional distance metrics, which limits their effectiveness on random data. In this work, we introduce the first k-means variant in the literature that operates within a probabilistic metric space, replacing conventional distance measures with a well-defined distance distribution function. This pioneering approach enables more flexible and robust clustering of both deterministic and random datasets, establishing a new foundation for clustering in stochastic environments. By adopting a probabilistic perspective, our method not only introduces a fresh paradigm but also provides a rigorous theoretical framework expected to serve as a key reference for future clustering research involving random data. Extensive experiments on diverse real and synthetic datasets assess the model's effectiveness using widely recognized evaluation metrics, including the Silhouette score, the Davies-Bouldin index, the Calinski-Harabasz index, the adjusted Rand index, and distortion. Comparative analyses against established methods such as k-means++, fuzzy c-means, and kernel probabilistic k-means demonstrate the superior performance of the proposed random normed k-means (RNKM) algorithm. Notably, RNKM exhibits a remarkable ability to identify nonlinearly separable structures, making it highly effective in complex clustering scenarios. These findings position RNKM as a significant advancement in clustering research, offering a powerful alternative to traditional techniques while addressing a long-standing gap in the literature. By bridging probabilistic metrics with clustering, this study opens new avenues for advanced data analysis in dynamic, data-driven applications.
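For orientation, the distance distribution function referred to above can be illustrated by the standard Menger/Schweizer-Sklar formulation of a probabilistic metric space; the sketch below states that textbook definition only and is not the specific construction used by RNKM, whose details follow in the paper.

% Illustrative sketch: distance distribution function in a Menger-type
% probabilistic metric space. F_{pq}(t) is read as the probability that the
% distance between p and q is less than t; the triangle inequality is
% expressed through a triangular norm T.
\[
  F_{pq} : \mathbb{R} \to [0,1], \qquad F_{pq}(t) = \Pr\bigl(d(p,q) < t\bigr),
\]
\[
  F_{pq}(0) = 0, \qquad F_{pq} = F_{qp}, \qquad
  \bigl(F_{pq}(t) = 1 \ \text{for all } t > 0\bigr) \iff p = q,
\]
\[
  F_{pr}(s + t) \;\ge\; T\bigl(F_{pq}(s),\, F_{qr}(t)\bigr)
  \qquad \text{for all } s, t \ge 0,
\]
where $T$ is a triangular norm, e.g. $T(a,b) = \min\{a,b\}$. In the deterministic special case, $F_{pq}$ collapses to the unit step at the crisp distance $d(p,q)$, so the ordinary metric setting is recovered as a limiting instance.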