A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov-Nagumo Average.

Osamu Komori, Shinto Eguchi.

Abstract

Clustering is a major unsupervised learning task that is widely applied in data mining and statistical data analysis. Typical methods include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized as hard, soft, and model-based clustering, respectively. We propose a new clustering method, called Pareto clustering, based on the Kolmogorov-Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clustering methods as well as maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which the underlying distribution that gives consistency is discussed. We build a minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare its performance with existing methods in simulation studies and in benchmark dataset analyses to demonstrate its practical utility.
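
As a rough illustration of the Kolmogorov-Nagumo average named in the abstract, the sketch below computes the general quasi-arithmetic mean phi_inv(mean(phi(x))) for a few generators. The abstract does not give the paper's exact generator or objective function, so the function names, the Pareto survival parameterization S(x) = (1 + gamma*x/sigma)^(-1/gamma), and the sample distances are all assumptions made for illustration only.

```python
import math


def kn_average(values, phi, phi_inv):
    """Kolmogorov-Nagumo (quasi-arithmetic) average: phi_inv(mean(phi(v)))."""
    return phi_inv(sum(phi(v) for v in values) / len(values))


def exp_generator(beta):
    """Illustrative generator phi(x) = exp(-beta * x) with its inverse."""
    def phi(x):
        return math.exp(-beta * x)

    def phi_inv(y):
        return -math.log(y) / beta

    return phi, phi_inv


def pareto_survival_generator(gamma, sigma=1.0):
    """Assumed generator: a Pareto-type survival function on x >= 0.

    S(x) = (1 + gamma * x / sigma) ** (-1 / gamma); this parameterization
    is hypothetical and may differ from the one used in the paper.
    """
    def phi(x):
        return (1.0 + gamma * x / sigma) ** (-1.0 / gamma)

    def phi_inv(y):
        return sigma * (y ** (-gamma) - 1.0) / gamma

    return phi, phi_inv


# Hypothetical squared distances from one observation to three cluster centers.
d2 = [0.5, 2.0, 4.5]

# Identity generator: the ordinary arithmetic mean.
arithmetic = kn_average(d2, lambda x: x, lambda y: y)

# Exponential generator with large beta: a soft minimum, close to min(d2),
# which is the per-point quantity that a hard assignment would use.
soft_min = kn_average(d2, *exp_generator(beta=50.0))

# Pareto-type generator: as gamma shrinks it approaches the exponential case.
pareto_small_gamma = kn_average(d2, *pareto_survival_generator(gamma=1e-4))
exponential_beta_1 = kn_average(d2, *exp_generator(beta=1.0))
```

Because each generator is monotone, every average here lies between the smallest and largest input; the choice of generator only controls how strongly the result is pulled toward the nearest center.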

Keywords:  Gaussian mixture model; Kolmogorov–Nagumo average; Pareto distribution; fuzzy-c; generalized energy function; k-means

Year:  2021        PMID: 33923177     DOI: 10.3390/e23050518

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.524


  7 in total

1.  Statistical mechanics and phase transitions in clustering.

Authors: 
Journal:  Phys Rev Lett       Date:  1990-08-20       Impact factor: 9.161

2.  General C-means clustering model.

Authors:  Jian Yu
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2005-08       Impact factor: 6.226

3.  Genetic-based EM algorithm for learning Gaussian mixture models.

Authors:  Franz Pernkopf; Djamel Bouchaffra
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2005-08       Impact factor: 6.226

4.  Genetic K-means algorithm.

Authors:  K Krishna; M Narasimha Murty
Journal:  IEEE Trans Syst Man Cybern B Cybern       Date:  1999

5.  Robust Clustering Method in the Presence of Scattered Observations.

Authors:  Akifumi Notsu; Shinto Eguchi
Journal:  Neural Comput       Date:  2016-03-04       Impact factor: 2.026

6.  mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models.

Authors:  Luca Scrucca; Michael Fop; T Brendan Murphy; Adrian E Raftery
Journal:  R J       Date:  2016-08       Impact factor: 3.984

7.  Quasi-linear score for capturing heterogeneous structure in biomarkers.

Authors:  Katsuhiro Omae; Osamu Komori; Shinto Eguchi
Journal:  BMC Bioinformatics       Date:  2017-06-19       Impact factor: 3.169
