
Hierarchical Clustering With Prototypes via Minimax Linkage.

Jacob Bien, Robert Tibshirani.

Abstract

Agglomerative hierarchical clustering is a popular class of methods for understanding the structure of a dataset. The nature of the clustering depends on the choice of linkage, that is, on how one measures the distance between clusters. In this article we investigate minimax linkage, a recently introduced but little-studied linkage. Minimax linkage is unique in naturally associating a prototype chosen from the original dataset with every interior node of the dendrogram. These prototypes can be used to greatly enhance the interpretability of a hierarchical clustering. Furthermore, we prove that minimax linkage has a number of desirable theoretical properties; for example, minimax-linkage dendrograms cannot have inversions (unlike those from centroid linkage), and the method is robust against certain perturbations of a dataset. We provide an efficient implementation and illustrate minimax linkage's strengths as a data analysis and visualization tool on a study of words from encyclopedia articles and on a dataset of images of human faces.
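
The abstract does not spell out the linkage itself. As a minimal sketch, assume the usual definition: the minimax distance between two clusters is the smallest radius r such that some point of the merged cluster lies within r of every other point in it, and that point serves as the prototype. The function names `minimax_radius` and `minimax_clustering`, the Euclidean metric, and the toy data are illustrative assumptions. This is a naive implementation for small inputs, not the efficient implementation the authors provide.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def minimax_radius(points, members):
    """Return (radius, prototype_index) for the cluster `members`.

    The prototype is the member whose farthest fellow member is closest.
    """
    best = None
    for c in members:
        r = max(dist(points[c], points[m]) for m in members)
        if best is None or r < best[0]:
            best = (r, c)
    return best


def minimax_clustering(points):
    """Naive agglomerative clustering with minimax linkage (assumed definition).

    Returns a list of merges: (left, right, height, prototype_index).
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Try every pair and keep the merge with the smallest minimax radius.
        for a, b in combinations(clusters, 2):
            height, proto = minimax_radius(points, a + b)
            if best is None or height < best[0]:
                best = (height, proto, a, b)
        height, proto, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, height, proto))
    return merges


# Hypothetical toy data: two well-separated groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
merges = minimax_clustering(points)
for left, right, height, proto in merges:
    print(left, right, height, points[proto])
```

Each merge carries a prototype that is an actual data point, and every member of the merged cluster lies within the merge height of it. On this toy input the merge heights never decrease, which is consistent with the no-inversion property stated in the abstract.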

Keywords:  Agglomerative; Dendrogram; Unsupervised learning

Year:  2011        PMID: 26257451      PMCID: PMC4527350          DOI: 10.1198/jasa.2011.tm10183

Source DB:  PubMed          Journal:  J Am Stat Assoc        ISSN: 0162-1459            Impact factor:   5.033


  7 in total

1.  CLUSTAG: hierarchical clustering and graph methods for selecting tag SNPs.

Authors:  S I Ao; Kevin Yip; Michael Ng; David Cheung; Pui-Yee Fong; Ian Melhado; Pak C Sham
Journal:  Bioinformatics       Date:  2004-12-07       Impact factor: 6.937

2.  Hybrid hierarchical clustering with applications to microarray data.

Authors:  Hugh Chipman; Robert Tibshirani
Journal:  Biostatistics       Date:  2005-11-21       Impact factor: 5.899

3.  Hausdorff clustering.

Authors:  Nicolas Basalto; Roberto Bellotti; Francesco De Carlo; Paolo Facchi; Ester Pantaleo; Saverio Pascazio
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2008-10-28

4.  A framework for feature selection in clustering.

Authors:  Daniela M Witten; Robert Tibshirani
Journal:  J Am Stat Assoc       Date:  2010-06-01       Impact factor: 5.033

5.  Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.

Authors:  U Alon; N Barkai; D A Notterman; K Gish; S Ybarra; D Mack; A J Levine
Journal:  Proc Natl Acad Sci U S A       Date:  1999-06-08       Impact factor: 11.205

6.  Cluster analysis and display of genome-wide expression patterns.

Authors:  M B Eisen; P T Spellman; P O Brown; D Botstein
Journal:  Proc Natl Acad Sci U S A       Date:  1998-12-08       Impact factor: 11.205

7.  Gene expression correlates of clinical prostate cancer behavior.

Authors:  Dinesh Singh; Phillip G Febbo; Kenneth Ross; Donald G Jackson; Judith Manola; Christine Ladd; Pablo Tamayo; Andrew A Renshaw; Anthony V D'Amico; Jerome P Richie; Eric S Lander; Massimo Loda; Philip W Kantoff; Todd R Golub; William R Sellers
Journal:  Cancer Cell       Date:  2002-03       Impact factor: 31.743

  20 in total

1.  SCALPEL: Extracting neurons from calcium imaging data.

Authors:  Ashley Petersen; Noah Simon; Daniela Witten
Journal:  Ann Appl Stat       Date:  2018-11-13       Impact factor: 2.083

2.  Sparse regression and marginal testing using cluster prototypes.

Authors:  Stephen Reid; Robert Tibshirani
Journal:  Biostatistics       Date:  2015-11-27       Impact factor: 5.899

3.  A robust approach for identifying differentially abundant features in metagenomic samples.

Authors:  Michael B Sohn; Ruofei Du; Lingling An
Journal:  Bioinformatics       Date:  2015-03-19       Impact factor: 6.937

4.  Statistical modelling of citation exchange between statistics journals.

Authors:  Cristiano Varin; Manuela Cattelan; David Firth
Journal:  J R Stat Soc Ser A Stat Soc       Date:  2015-11-03       Impact factor: 2.483

5.  Statistical Approaches to Address Multi-Pollutant Mixtures and Multiple Exposures: the State of the Science. (Review)

Authors:  Massimo Stafoggia; Susanne Breitner; Regina Hampel; Xavier Basagaña
Journal:  Curr Environ Health Rep       Date:  2017-12

6.  Quantifying yeast colony morphologies with feature engineering from time-lapse photography.

Authors:  Andy Goldschmidt; James Kunert-Graf; Adrian C Scott; Zhihao Tan; Aimée M Dudley; J Nathan Kutz
Journal:  Sci Data       Date:  2022-05-17       Impact factor: 8.501

7.  Robust Analysis of Phylogenetic Tree Space.

Authors:  Martin R Smith
Journal:  Syst Biol       Date:  2022-08-10       Impact factor: 9.160

8.  Customized training with an application to mass spectrometric imaging of cancer tissue.

Authors:  Scott Powers; Trevor Hastie; Robert Tibshirani
Journal:  Ann Appl Stat       Date:  2016-01-28       Impact factor: 2.083

9.  Expanding TNM for lung cancer through machine learning.

Authors:  Matthew Hueman; Huan Wang; Zhenqiu Liu; Donald Henson; Cuong Nguyen; Dean Park; Li Sheng; Dechang Chen
Journal:  Thorac Cancer       Date:  2021-03-13       Impact factor: 3.500

10.  Learning the Structure of Biomedical Relationships from Unstructured Text.

Authors:  Bethany Percha; Russ B Altman
Journal:  PLoS Comput Biol       Date:  2015-07-28       Impact factor: 4.475
