Stability-based validation of clustering solutions.

Tilman Lange, Volker Roth, Mikio L Braun, Joachim M Buhmann.

Abstract

Data clustering describes a set of frequently employed techniques in exploratory data analysis to extract "natural" group structure in data. Such groupings need to be validated to separate the signal in the data from spurious structure. In this context, finding an appropriate number of clusters is a particularly important model selection question. We introduce a measure of cluster stability to assess the validity of a cluster model. This stability measure quantifies the reproducibility of clustering solutions on a second sample, and it can be interpreted as a classification risk with regard to class labels produced by a clustering algorithm. The preferred number of clusters is determined by minimizing this classification risk as a function of the number of clusters. Convincing results are achieved on simulated as well as gene expression data sets. Comparisons to other methods demonstrate the competitive performance of our method and its suitability as a general validation tool for clustering solutions in real-world problems.
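The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: k-means as the clustering algorithm, a nearest-centroid rule as the classifier, a brute-force search over label permutations to align the two labelings, and a random-labeling baseline to make scores comparable across different numbers of clusters. All function names (`kmeans`, `disagreement`, `instability`) and the synthetic data are hypothetical.

```python
import itertools
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centroids):
    """Index of the centroid closest to p (nearest-centroid classifier)."""
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))

def kmeans(points, k, rng, iters=50):
    """Lloyd's k-means with farthest-first initialisation; returns centroids."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids

def disagreement(a, b, k):
    """Smallest fraction of mismatched labels over all relabelings of b.

    Brute force over k! permutations, so only suitable for small k.
    """
    best = len(a)
    for perm in itertools.permutations(range(k)):
        best = min(best, sum(1 for x, y in zip(a, b) if x != perm[y]))
    return best / len(a)

def instability(points, k, rng, splits=10):
    """Average disagreement between transferred and direct clusterings."""
    raw = baseline = 0.0
    for _ in range(splits):
        shuffled = list(points)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first, second = shuffled[:half], shuffled[half:]
        # Cluster the first sample and use it to label the second sample.
        classifier = kmeans(first, k, rng)
        predicted = [nearest(p, classifier) for p in second]
        # Cluster the second sample directly.
        own = kmeans(second, k, rng)
        clustered = [nearest(p, own) for p in second]
        raw += disagreement(predicted, clustered, k)
        # Assumed normalisation: disagreement of two random labelings.
        rand_a = [rng.randrange(k) for _ in second]
        rand_b = [rng.randrange(k) for _ in second]
        baseline += disagreement(rand_a, rand_b, k)
    return raw / baseline if baseline > 0 else raw

# Synthetic data: three well-separated groups in the plane.
rng = random.Random(0)
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in centres for _ in range(40)]
scores = {k: instability(points, k, rng) for k in (2, 3, 4)}
preferred_k = min(scores, key=scores.get)
```

Here `scores` maps each candidate number of clusters to its normalized instability, and `preferred_k` is the candidate with the lowest score. For data this cleanly separated, the two labelings agree perfectly at k = 3, so that score is zero. The permutation search grows factorially with k; a linear assignment solver would be the usual replacement for larger k.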

Year: 2004   PMID: 15130251   DOI: 10.1162/089976604773717621

Source DB: PubMed   Journal: Neural Comput   ISSN: 0899-7667   Impact factor: 2.026


  53 in total

1.  Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models.

Authors:  Han Liu; Kathryn Roeder; Larry Wasserman
Journal:  Adv Neural Inf Process Syst       Date:  2010-12-31

2.  Search for patterns of functional specificity in the brain: a nonparametric hierarchical Bayesian model for group fMRI data.

Authors:  Danial Lashkari; Ramesh Sridharan; Edward Vul; Po-Jang Hsieh; Nancy Kanwisher; Polina Golland
Journal:  Neuroimage       Date:  2011-08-22       Impact factor: 6.556

3.  The use of clustering algorithms in critical care research to unravel patient heterogeneity.

Authors:  José Castela Forte; Anders Perner; Iwan C C van der Horst
Journal:  Intensive Care Med       Date:  2019-05-06       Impact factor: 17.440

4.  High-dimensional longitudinal classification with the multinomial fused lasso.

Authors:  Samrachana Adhikari; Fabrizio Lecci; James T Becker; Brian W Junker; Lewis H Kuller; Oscar L Lopez; Ryan J Tibshirani
Journal:  Stat Med       Date:  2019-01-30       Impact factor: 2.373

5.  Dynamic Cortical Connectivity during General Anesthesia in Surgical Patients.

Authors:  Phillip E Vlisides; Duan Li; Mackenzie Zierau; Andrew P Lapointe; Ka I Ip; Amy M McKinney; George A Mashour
Journal:  Anesthesiology       Date:  2019-06       Impact factor: 7.892

6.  HYDRA: Revealing heterogeneity of imaging and genetic patterns through a multiple max-margin discriminative analysis framework.

Authors:  Erdem Varol; Aristeidis Sotiras; Christos Davatzikos
Journal:  Neuroimage       Date:  2016-02-23       Impact factor: 6.556

7.  Differentially categorized structural brain hubs are involved in different microstructural, functional, and cognitive characteristics and contribute to individual identification.

Authors:  Xindi Wang; Qixiang Lin; Mingrui Xia; Yong He
Journal:  Hum Brain Mapp       Date:  2018-01-04       Impact factor: 5.038

8.  College Students' Perception of Current and Projected 30-Year Cardiovascular Disease Risk Using Cluster Analysis with Internal Validation.

Authors:  Dieu-My T Tran
Journal:  J Community Health       Date:  2019-06

9.  Identifying prototypical components in behaviour using clustering algorithms.

Authors:  Elke Braun; Bart Geurten; Martin Egelhaaf
Journal:  PLoS One       Date:  2010-02-22       Impact factor: 3.240

10.  A highly efficient multi-core algorithm for clustering extremely large datasets.

Authors:  Johann M Kraus; Hans A Kestler
Journal:  BMC Bioinformatics       Date:  2010-04-06       Impact factor: 3.169
