
Adaptive Initialization Method for K-Means Algorithm.

Jie Yang, Yu-Kai Wang, Xin Yao, Chin-Teng Lin.

Abstract

The K-means algorithm is a widely used clustering algorithm valued for its simplicity and efficiency. However, the traditional K-means algorithm chooses its initial cluster centers at random, which makes clustering results prone to local optima and degrades clustering performance. In this research, we propose an adaptive initialization method for the K-means algorithm (AIMK) that adapts to the varying characteristics of different datasets and obtains better clustering performance with stable results. For larger or higher-dimensional datasets, we additionally use random sampling in AIMK (named AIMK-RS) to reduce the time complexity. Twenty-two real-world datasets were used for performance comparisons. The experimental results show that AIMK and AIMK-RS outperform current initialization methods and several well-known clustering algorithms. Specifically, AIMK-RS reduces the time complexity significantly, to O(n). Moreover, we use AIMK to initialize K-medoids and spectral clustering and also observe better performance. These results demonstrate the superior performance and good scalability of AIMK and AIMK-RS. In the future, we would like to apply AIMK to more partition-based clustering algorithms to solve real-life practical problems.
Copyright © 2021 Yang, Wang, Yao and Lin.
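
The sensitivity to random initialization described in the abstract can be illustrated with a minimal sketch. It is plain Lloyd's K-means with uniformly random initial centers, not the AIMK method, whose details are not given in this record. The toy data and the helper names (`sq_dist`, `kmeans`, `make_blobs`) are assumptions made for illustration only.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, seed, max_iter=100):
    """Lloyd's K-means with uniformly random initial centers (NOT AIMK)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initialization
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Sum of squared errors (SSE) of the final partition.
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse

def make_blobs(seed=0):
    """Hypothetical toy data: three 2-D Gaussian blobs of 30 points each."""
    rng = random.Random(seed)
    blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
    return [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
            for cx, cy in blob_centers for _ in range(30)]

points = make_blobs()
# Same data, same k; only the seed of the random initialization changes.
sse_by_seed = {seed: kmeans(points, k=3, seed=seed)[2] for seed in range(20)}
print("best SSE:", min(sse_by_seed.values()))
print("worst SSE:", max(sse_by_seed.values()))
```

Any gap between the best and worst SSE across seeds comes from the initialization alone, which is the instability an initialization method such as AIMK aims to remove.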

Keywords:  adaptive; clustering; initial cluster centers; initialization method; k-means

Year:  2021        PMID: 34901837      PMCID: PMC8656690          DOI: 10.3389/frai.2021.740817

Source DB:  PubMed          Journal:  Front Artif Intell        ISSN: 2624-8212


  5 in total

1.  Machine learning. Clustering by fast search and find of density peaks.

Authors:  Alex Rodriguez; Alessandro Laio
Journal:  Science       Date:  2014-06-27       Impact factor: 47.728

2.  Hierarchical clustering schemes.

Authors:  S C Johnson
Journal:  Psychometrika       Date:  1967-09       Impact factor: 2.500

3.  Robust continuous clustering.

Authors:  Sohil Atul Shah; Vladlen Koltun
Journal:  Proc Natl Acad Sci U S A       Date:  2017-08-29       Impact factor: 11.205

4.  An Initialization Method Based on Hybrid Distance for k-Means Algorithm.

Authors:  Jie Yang; Yan Ma; Xiangfen Zhang; Shunbao Li; Yuping Zhang
Journal:  Neural Comput       Date:  2017-09-28       Impact factor: 2.026

5.  Challenges in unsupervised clustering of single-cell RNA-seq data. (Review)

Authors:  Vladimir Yu Kiselev; Tallulah S Andrews; Martin Hemberg
Journal:  Nat Rev Genet       Date:  2019-05       Impact factor: 53.242

