Clustering in the presence of scatter.

Ranjan Maitra, Ivan P Ramler.

Abstract

SUMMARY: A new methodology is proposed for clustering datasets in the presence of scattered observations. Scattered observations are defined as those unlike any other, so traditional approaches that force them into groups can lead to erroneous conclusions. Our suggested approach is a scheme that, under the assumption of homogeneous spherical clusters, iteratively builds cores around the cluster centers and groups the points within each core while identifying the points outside as scatter. In the absence of scatter, the algorithm reduces to k-means. We also provide methodology to initialize the algorithm and to estimate the number of clusters in the dataset. Results in experimental situations show excellent performance, especially when clusters are elliptically symmetric. The methodology is applied to the analysis of the United States Environmental Protection Agency's Toxic Release Inventory reports on industrial releases of mercury for the year 2000.
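The core-and-scatter idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not say how the cores are constructed, so the sketch assumes a single fixed core radius supplied by the user. The function name `kmeans_with_scatter`, the `radius` parameter, and the label `-1` for scatter are all hypothetical choices made for illustration. With `radius` set to infinity no point is treated as scatter and the loop becomes ordinary Lloyd-style k-means, mirroring the reduction mentioned in the abstract.

```python
import numpy as np

def kmeans_with_scatter(X, init_centers, radius, n_iter=100):
    """Illustrative sketch only (assumed fixed-radius cores, not the paper's method).

    Points within `radius` of their nearest center join that cluster;
    all other points are labeled -1 (scatter) and do not move the centers.
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        # Euclidean distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        in_core = dists[np.arange(len(X)), nearest] <= radius
        new_labels = np.where(in_core, nearest, -1)
        # Update each center from its core members only; keep it if the core is empty.
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[new_labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        converged = np.array_equal(new_labels, labels) and np.allclose(new_centers, centers)
        labels, centers = new_labels, new_centers
        if converged:
            break
    return labels, centers

# Toy data (made up for illustration): two tight groups and one isolated point.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [50, -50]])
labels, centers = kmeans_with_scatter(X, init_centers=[[0, 0], [10, 10]], radius=3.0)
print(labels)  # the isolated point receives the scatter label -1
```

In this toy example the isolated point lies far outside both cores, so it is left unassigned and does not pull either center toward it, which is the behavior the abstract contrasts with methods that force every observation into a group.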

Year: 2008    PMID: 18537949    DOI: 10.1111/j.1541-0420.2008.01064.x

Source DB: PubMed    Journal: Biometrics    ISSN: 0006-341X    Impact factor: 2.571


  6 in total

1.  Merging K-means with hierarchical clustering for identifying general-shaped groups.

Authors:  Anna D Peterson; Arka P Ghosh; Ranjan Maitra
Journal:  Stat (Int Stat Inst)       Date:  2018-01-17

2.  Integrative Sparse K-Means With Overlapping Group Lasso in Genomic Applications for Disease Subtype Discovery.

Authors:  Zhiguang Huo; George Tseng
Journal:  Ann Appl Stat       Date:  2017-07-20       Impact factor: 2.083

3.  MplusAutomation: An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus.

Authors:  Michael N Hallquist; Joshua F Wiley
Journal:  Struct Equ Modeling       Date:  2018-01-19       Impact factor: 6.125

4.  Comparative Pathway Integrator: A Framework of Meta-Analytic Integration of Multiple Transcriptomic Studies for Consensual and Differential Pathway Analysis.

Authors:  Xiangrui Zeng; Wei Zong; Chien-Wei Lin; Zhou Fang; Tianzhou Ma; David A Lewis; John F Enwright; George C Tseng
Journal:  Genes (Basel)       Date:  2020-06-24       Impact factor: 4.096

5.  Integrative clustering of multi-level omics data for disease subtype discovery using sequential double regularization.

Authors:  Sunghwan Kim; Steffi Oesterreich; Seyoung Kim; Yongseok Park; George C Tseng
Journal:  Biostatistics       Date:  2016-08-22       Impact factor: 5.899

6.  CLAG: an unsupervised non hierarchical clustering algorithm handling biological data.

Authors:  Linda Dib; Alessandra Carbone
Journal:  BMC Bioinformatics       Date:  2012-08-08       Impact factor: 3.169
