Robust continuous clustering.

Sohil Atul Shah, Vladlen Koltun.

Abstract

Clustering is a fundamental procedure in the analysis of scientific data. It is used ubiquitously across the sciences. Despite decades of research, existing clustering algorithms have limited effectiveness in high dimensions and often require tuning parameters for different domains and datasets. We present a clustering algorithm that achieves high accuracy across multiple domains and scales efficiently to high dimensions and large datasets. The presented algorithm optimizes a smooth continuous objective, which is based on robust statistics and allows heavily mixed clusters to be untangled. The continuous nature of the objective also allows clustering to be integrated as a module in end-to-end feature learning pipelines. We demonstrate this by extending the algorithm to perform joint clustering and dimensionality reduction by efficiently optimizing a continuous global objective. The presented approach is evaluated on large datasets of faces, hand-written digits, objects, newswire articles, sensor readings from the Space Shuttle, and protein expression levels. Our method achieves high accuracy across all datasets, outperforming the best prior algorithm by a factor of 3 in average rank.
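The abstract describes a smooth continuous objective based on robust statistics but does not state it. As a minimal sketch only, the Python below assumes one common shape for such an objective: each point `X[i]` gets a representative `U[i]`, a quadratic data term keeps representatives near their points, and a Geman-McClure penalty on representative distances along graph edges pulls connected representatives together while saturating for distant pairs. The function names, the choice of penalty, the edge list, and the parameters `lam` and `mu` are all illustrative assumptions, not details taken from this record.

```python
import numpy as np


def geman_mcclure(y, mu):
    """Robust penalty rho(y) = mu * y^2 / (mu + y^2).

    Behaves like y^2 for small y and saturates toward mu for large y,
    so spurious long-range connections contribute a bounded cost.
    """
    y2 = np.square(y)
    return mu * y2 / (mu + y2)


def robust_clustering_objective(X, U, edges, lam=1.0, mu=1.0):
    """Evaluate an assumed robust clustering objective (illustrative form).

    X     : (n, d) array of data points.
    U     : (n, d) array of representatives, one per data point.
    edges : (m, 2) integer array of index pairs (p, q) in a neighbor graph.
    lam   : weight of the pairwise term (hypothetical parameter).
    mu    : scale of the robust penalty (hypothetical parameter).
    """
    # Quadratic data term: representatives should stay near their points.
    data_term = 0.5 * np.sum((X - U) ** 2)

    # Robust pairwise term over graph edges.
    p, q = edges[:, 0], edges[:, 1]
    dists = np.linalg.norm(U[p] - U[q], axis=1)
    pairwise_term = 0.5 * lam * np.sum(geman_mcclure(dists, mu))

    return data_term + pairwise_term
```

Because every term is differentiable in `U`, an objective of this kind can be minimized with gradient-based methods, which is what would let clustering sit inside an end-to-end learning pipeline as the abstract suggests.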

Keywords: clustering; data analysis; unsupervised learning

Year: 2017; PMID: 28851838; PMCID: PMC5603997; DOI: 10.1073/pnas.1700770114

Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205


