Determining the number of clusters using the weighted gap statistic.

Mingjin Yan, Keying Ye.

Abstract

Estimating the number of clusters in a data set is a crucial step in cluster analysis. In this article, motivated by the gap method (Tibshirani, Walther, and Hastie, 2001, Journal of the Royal Statistical Society B 63, 411-423), we propose the weighted gap and the difference of difference-weighted (DD-weighted) gap methods for estimating the number of clusters in data. Both use the weighted within-clusters sum of errors, a measure of within-cluster homogeneity. In addition, we propose a "multilayer" clustering approach, which is shown to be more accurate than the original gap method, particularly in detecting nested cluster structure in the data. The methods are applicable when the input data contain continuous measurements and can be used with any clustering method. The proposed methods are compared with one another and with the original gap method in simulation studies and on real data.
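
For readers unfamiliar with the gap method that the abstract builds on, the following is a minimal sketch, not the authors' implementation. It computes the gap statistic of Tibshirani, Walther, and Hastie (2001) with a plain NumPy k-means and uniform reference data drawn over the bounding box of the input. The `weighted=True` option is an assumption about what a weighted within-clusters sum of errors could look like: it normalizes each cluster's pairwise sum by 2 n_r (n_r - 1) instead of 2 n_r. That normalization is not stated in the abstract and should be checked against the paper. The DD-weighted gap and the multilayer approach are not sketched. All function and parameter names are hypothetical.

```python
import numpy as np


def _kmeans_labels(X, k, rng, n_iter=50):
    """Lloyd's k-means with k-means++ seeding; returns one label per row."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels


def within_dispersion(X, labels, weighted=False):
    """Sum over clusters of D_r / (2 n_r), or D_r / (2 n_r (n_r - 1)) if
    weighted (ASSUMED form of the weighted sum of errors)."""
    total = 0.0
    for j in np.unique(labels):
        pts = X[labels == j]
        n = len(pts)
        if n < 2:
            continue  # a singleton contributes no within-cluster spread
        # D_r over all ordered pairs equals 2 n times the squared deviations.
        d_r = 2.0 * n * ((pts - pts.mean(axis=0)) ** 2).sum()
        total += d_r / (2.0 * n * (n - 1)) if weighted else d_r / (2.0 * n)
    return total


def _log_dispersion(X, k, rng, weighted, n_init):
    """Keep the restart with the lowest k-means objective, then score it."""
    runs = [_kmeans_labels(X, k, rng) for _ in range(n_init)]
    best = min(runs, key=lambda lab: within_dispersion(X, lab))
    return np.log(within_dispersion(X, best, weighted))


def gap_statistic(X, k_max=6, n_ref=10, weighted=False, seed=0, n_init=5):
    """Return (k_hat, gaps) where gaps[k-1] is the gap value for k clusters."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    gaps, sds = [], []
    for k in range(1, k_max + 1):
        log_w = _log_dispersion(X, k, rng, weighted, n_init)
        ref = np.array([
            _log_dispersion(rng.uniform(lo, hi, size=X.shape), k, rng,
                            weighted, n_init)
            for _ in range(n_ref)
        ])
        gaps.append(ref.mean() - log_w)
        sds.append(ref.std() * np.sqrt(1.0 + 1.0 / n_ref))
    gaps, sds = np.array(gaps), np.array(sds)
    # Original gap rule: smallest k with Gap(k) >= Gap(k+1) - s_{k+1}.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - sds[k]:
            return k, gaps
    return k_max, gaps
```

Calling `gap_statistic(X)` on an array of continuous measurements returns the estimate from the original selection rule together with the full gap curve, which is often worth plotting because the first-k rule can stop early. Passing `weighted=True` swaps in the assumed weighted dispersion while leaving the rest of the procedure unchanged.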

Year:  2007        PMID: 17425640     DOI: 10.1111/j.1541-0420.2007.00784.x

Source DB:  PubMed          Journal:  Biometrics        ISSN: 0006-341X            Impact factor:   2.571


  28 in total

1.  Clustering of gene expression data based on shape similarity.

Authors:  Travis J Hestilow; Yufei Huang
Journal:  EURASIP J Bioinform Syst Biol       Date:  2009-04-23

2.  Dynamic shifts in brain network activation during supracapacity working memory task performance.

Authors:  Jared X Van Snellenberg; Mark Slifstein; Christina Read; Jochen Weber; Judy L Thompson; Tor D Wager; Daphna Shohamy; Anissa Abi-Dargham; Edward E Smith
Journal:  Hum Brain Mapp       Date:  2014-11-24       Impact factor: 5.038

3.  Examining the sublineage structure of Mycobacterium tuberculosis complex strains with multiple-biomarker tensors.

Authors:  Cagri Ozcaglar; Amina Shabbeer; Scott Vandenberg; Bülent Yener; Kristin P Bennett
Journal:  Proceedings (IEEE Int Conf Bioinformatics Biomed)       Date:  2010

4.  Altered Cortical Ensembles in Mouse Models of Schizophrenia.

Authors:  Jordan P Hamm; Darcy S Peterka; Joseph A Gogos; Rafael Yuste
Journal:  Neuron       Date:  2017-04-05       Impact factor: 17.173

5.  Determination of genetic structure of germplasm collections: are traditional hierarchical clustering methods appropriate for molecular marker data?

Authors:  T L Odong; J van Heerwaarden; J Jansen; T J L van Hintum; F A van Eeuwijk
Journal:  Theor Appl Genet       Date:  2011-04-07       Impact factor: 5.699

6.  Online Phenotype Discovery based on Minimum Classification Error Model.

Authors:  Zheng Yin; Xiaobo Zhou; Youxian Sun; Stephen T C Wong
Journal:  Pattern Recognit       Date:  2009-04       Impact factor: 7.740

7.  Algorithmic approaches to aid species' delimitation in multidimensional morphospace.

Authors:  Thomas H G Ezard; Paul N Pearson; Andy Purvis
Journal:  BMC Evol Biol       Date:  2010-06-11       Impact factor: 3.260

8.  Identifying Regions Based on Flexible User Defined Constraints.

Authors:  David C Folch; Seth E Spielman
Journal:  Int J Geogr Inf Sci       Date:  2014       Impact factor: 4.186

9.  Integrated genetic and epigenetic analysis of childhood acute lymphoblastic leukemia.

Authors:  Maria E Figueroa; Shann-Ching Chen; Anna K Andersson; Letha A Phillips; Yushan Li; Jason Sotzen; Mondira Kundu; James R Downing; Ari Melnick; Charles G Mullighan
Journal:  J Clin Invest       Date:  2013-06-10       Impact factor: 14.808

10.  Modulated modularity clustering as an exploratory tool for functional genomic inference.

Authors:  Eric A Stone; Julien F Ayroles
Journal:  PLoS Genet       Date:  2009-05-08       Impact factor: 5.917
