On Strategies to Fix Degenerate k-means Solutions

Abstract : k-means is a benchmark algorithm used in cluster analysis. It belongs to the large category of heuristics based on location-allocation steps that alternately locate cluster centers and allocate data points to them until no further improvement is possible. Such heuristics are known to suffer from a phenomenon called degeneracy, in which some of the clusters are empty. In this paper, we compare and propose a series of strategies to circumvent degenerate solutions during a k-means execution. Our computational experiments show that these strategies are effective, leading to better clustering solutions in the vast majority of the cases in which degeneracy appears in k-means. Moreover, we compare the use of our fixing strategies within k-means against the use of two initialization methods found in the literature. These results demonstrate how useful the proposed strategies can be, especially inside memory-based clustering algorithms.
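
The location-allocation loop and the degeneracy described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names (`allocate`, `empty_clusters`, `fix_degeneracy`, `locate`, `kmeans`) are hypothetical, and the repair rule used here (moving the point farthest from its own center into each empty cluster) is one simple heuristic assumed for illustration, which may or may not match any of the strategies the authors evaluate.

```python
from math import dist


def allocate(points, centers):
    """Allocation step: label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def empty_clusters(labels, k):
    """Return the indices of clusters that received no points (degeneracy)."""
    used = set(labels)
    return [j for j in range(k) if j not in used]


def fix_degeneracy(points, centers, labels):
    """Assumed repair rule (illustrative only): for each empty cluster, move in
    the point that is farthest from its current center, taking it only from a
    cluster that has more than one member so no new empty cluster is created."""
    k = len(centers)
    labels = list(labels)
    for j in empty_clusters(labels, k):
        counts = [labels.count(c) for c in range(k)]
        candidates = [i for i in range(len(points)) if counts[labels[i]] > 1]
        far = max(candidates, key=lambda i: dist(points[i], centers[labels[i]]))
        labels[far] = j
    return labels


def locate(points, labels, k):
    """Location step: move each center to the mean of its assigned points."""
    centers = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return centers


def kmeans(points, centers, max_iter=100):
    """Alternate allocation and location until the labels stop changing."""
    k = len(centers)
    if len(points) < k:
        raise ValueError("need at least as many points as clusters")
    labels = None
    for _ in range(max_iter):
        new_labels = fix_degeneracy(points, centers, allocate(points, centers))
        if new_labels == labels:
            break
        labels = new_labels
        centers = locate(points, labels, k)
    return centers, labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    init = [(0, 0.5), (10, 0.5), (100, 100)]  # third center attracts no points
    print("empty before fix:", empty_clusters(allocate(pts, init), 3))
    final_centers, final_labels = kmeans(pts, init)
    print("labels after fix:", final_labels)
```

In this toy input the third initial center lies far from all data, so a plain allocation step leaves its cluster empty. The repair step then guarantees that every cluster is non-empty before the centers are relocated.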
Document type : Journal articles

https://hal-uphf.archives-ouvertes.fr/hal-03402183
Contributor : Kathleen Torck
Submitted on : Monday, October 25, 2021 - 3:34:40 PM
Last modification on : Wednesday, November 3, 2021 - 5:24:24 AM

Citation

Daniel Aloise, Nielsen Castelo Damasceno, Nenad Mladenovic, Daniel Nobre Pinheiro. On Strategies to Fix Degenerate k-means Solutions. Journal of Classification, Springer Verlag, 2017, 34 (2), pp.165-190. ⟨10.1007/s00357-017-9231-0⟩. ⟨hal-03402183⟩