Evaluation of stability of k-means cluster ensembles with respect to random initialization.

Ludmila I Kuncheva, Dmitry P Vetrov.

Abstract

Many clustering algorithms, including cluster ensembles, rely on a random component. Stability of the results across different runs is considered to be an asset of the algorithm. The cluster ensembles considered here are based on k-means clusterers. Each clusterer is assigned a random target number of clusters, k, and is started from a random initialization. Here, we use 10 artificial and 10 real data sets to study ensemble stability with respect to random k and random initialization. The data sets were chosen to have a small number of clusters (two to seven) and a moderate number of data points (up to a few hundred). Pairwise stability is defined as the adjusted Rand index between pairs of clusterers in the ensemble, averaged across all pairs. Nonpairwise stability is defined as the entropy of the consensus matrix of the ensemble. An experimental comparison with the stability of the standard k-means algorithm was carried out for k from 2 to 20. The results revealed that ensembles are generally more stable, markedly so for larger k. To establish whether stability can serve as a cluster validity index, we first looked at the relationship between stability and accuracy with respect to the number of clusters, k. We found that such a relationship strongly depends on the data set, varying from almost perfect positive correlation (0.97, for the glass data) to almost perfect negative correlation (-0.93, for the crabs data). We propose a new combined stability index to be the sum of the pairwise individual and ensemble stabilities. This index was found to correlate better with the ensemble accuracy. Following the hypothesis that a point of stability of a clustering algorithm corresponds to a structure found in the data, we used the stability measures to pick the number of clusters. The combined stability index gave the best results.
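The stability measures named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' code. All function names are hypothetical, each partition is assumed to be a list of integer cluster labels over the same points (at least two points, at least two partitions), and the entropy is assumed to be the mean binary entropy of the off-diagonal consensus entries, since the abstract does not give the exact formula.

```python
from collections import Counter
from itertools import combinations
from math import comb, log2


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same points."""
    cells = Counter(zip(a, b))  # contingency table cells
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_rows * sum_cols / comb(len(a), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def pairwise_stability(partitions):
    """Adjusted Rand index averaged over all pairs of partitions."""
    pairs = list(combinations(partitions, 2))
    return sum(adjusted_rand_index(p, q) for p, q in pairs) / len(pairs)


def consensus_matrix(partitions):
    """Entry (i, j) is the fraction of partitions that co-cluster i and j."""
    n, runs = len(partitions[0]), len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / runs for j in range(n)]
            for i in range(n)]


def consensus_entropy(partitions):
    """Assumed form: mean binary entropy of off-diagonal consensus entries.

    0 means every pair of points is always together or always apart
    (fully stable); 1 means every pair is together in half of the runs.
    """
    m = consensus_matrix(partitions)

    def h(p):
        return 0.0 if p in (0.0, 1.0) else -(p * log2(p) + (1 - p) * log2(1 - p))

    values = [h(m[i][j]) for i, j in combinations(range(len(m)), 2)]
    return sum(values) / len(values)


def combined_stability(individual_partitions, ensemble_partitions):
    """Sum of the pairwise individual and pairwise ensemble stabilities."""
    return (pairwise_stability(individual_partitions)
            + pairwise_stability(ensemble_partitions))
```

Here `individual_partitions` would hold the label vectors from repeated single k-means runs and `ensemble_partitions` the label vectors from repeated ensemble runs at the same k. How those partitions are produced is outside this sketch.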

Year:  2006        PMID: 17063684     DOI: 10.1109/TPAMI.2006.226

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0098-5589            Impact factor:   6.226


  8 in total

1.  A hierarchical method for whole-brain connectivity-based parcellation.

Authors:  David Moreno-Dominguez; Alfred Anwander; Thomas R Knösche
Journal:  Hum Brain Mapp       Date:  2014-04-17       Impact factor: 5.038

2.  Ensemble Clustering using Semidefinite Programming with Applications.

Authors:  Vikas Singh; Lopamudra Mukherjee; Jiming Peng; Jinhui Xu
Journal:  Mach Learn       Date:  2010-05       Impact factor: 2.940

3.  The threshold bootstrap clustering: a new approach to find families or transmission clusters within molecular quasispecies.

Authors:  Mattia C F Prosperi; Andrea De Luca; Simona Di Giambenedetto; Laura Bracciale; Massimiliano Fabbiani; Roberto Cauda; Marco Salemi
Journal:  PLoS One       Date:  2010-10-25       Impact factor: 3.240

4.  Finding reproducible cluster partitions for the k-means algorithm.

Authors:  Paulo J G Lisboa; Terence A Etchells; Ian H Jarman; Simon J Chambers
Journal:  BMC Bioinformatics       Date:  2013-01-14       Impact factor: 3.169

5.  Autoencoder-based cluster ensembles for single-cell RNA-seq data analysis.

Authors:  Thomas A Geddes; Taiyun Kim; Lihao Nan; James G Burchfield; Jean Y H Yang; Dacheng Tao; Pengyi Yang
Journal:  BMC Bioinformatics       Date:  2019-12-24       Impact factor: 3.169

6.  Sc-GPE: A Graph Partitioning-Based Cluster Ensemble Method for Single-Cell.

Authors:  Xiaoshu Zhu; Jian Li; Hong-Dong Li; Miao Xie; Jianxin Wang
Journal:  Front Genet       Date:  2020-12-15       Impact factor: 4.599

7.  MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering.

Authors:  Eun-Youn Kim; Seon-Young Kim; Daniel Ashlock; Dougu Nam
Journal:  BMC Bioinformatics       Date:  2009-08-22       Impact factor: 3.169

8.  Convalescing Cluster Configuration Using a Superlative Framework.

Authors:  R Sabitha; S Karthik
Journal:  ScientificWorldJournal       Date:  2015-10-12
