A Unified Formulation of k-Means, Fuzzy c-Means and Gaussian Mixture Model by the Kolmogorov-Nagumo Average.

Abstract

Clustering is a major class of unsupervised learning methods and is widely applied in data mining and statistical data analysis. Typical examples include k-means, fuzzy c-means, and Gaussian mixture models, which are categorized as hard, soft, and model-based clustering, respectively. We propose a new clustering method, called Pareto clustering, based on the Kolmogorov-Nagumo average, which is defined by a survival function of the Pareto distribution. The proposed algorithm incorporates all the aforementioned clustering methods as well as maximum-entropy clustering. We introduce a probabilistic framework for the proposed method, in which we discuss the underlying distribution that gives consistency. We build a minorize-maximization algorithm to estimate the parameters in Pareto clustering. We compare its performance with existing methods in simulation studies and in benchmark dataset analyses to demonstrate its practical utility.
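
To make the central object of the abstract concrete, the following is a minimal Python sketch of a Kolmogorov-Nagumo (quasi-arithmetic) average, phi_inv(sum_k w_k * phi(v_k)), applied to squared distances from one point to several cluster centers. It is an illustration under stated assumptions, not the authors' implementation: the function names, the sample distances, and in particular the Pareto-type survival function (1 + beta*t/tau)^(-tau) are hypothetical choices and are not taken from the paper. The sketch only shows two general facts: an exponential generator yields a soft minimum that approaches the hard minimum (the k-means-style assignment) as beta grows, and the assumed Pareto-type generator approaches the exponential one as tau grows.

```python
import math


def kn_average(values, weights, phi, phi_inv):
    """Kolmogorov-Nagumo average: phi_inv(sum_k w_k * phi(v_k)).

    `weights` are assumed to be non-negative and to sum to 1.
    """
    return phi_inv(sum(w * phi(v) for v, w in zip(values, weights)))


def exponential_generator(beta):
    """Generator phi(t) = exp(-beta * t) and its inverse."""
    return (lambda t: math.exp(-beta * t),
            lambda s: -math.log(s) / beta)


def pareto_survival_generator(beta, tau):
    """ASSUMED parametrization (not from the paper):
    phi(t) = (1 + beta * t / tau) ** (-tau) for t >= 0, and its inverse.
    """
    return (lambda t: (1.0 + beta * t / tau) ** (-tau),
            lambda s: tau * (s ** (-1.0 / tau) - 1.0) / beta)


# Hypothetical squared distances from one point to three cluster centers.
sq_dists = [0.5, 2.0, 4.5]
weights = [1.0 / 3.0] * 3

# Identity generator: the KN average reduces to the arithmetic mean.
arithmetic = kn_average(sq_dists, weights, lambda t: t, lambda s: s)

# Exponential generator with large beta: soft minimum close to min(sq_dists).
phi, phi_inv = exponential_generator(beta=50.0)
soft_min = kn_average(sq_dists, weights, phi, phi_inv)

# Assumed Pareto-type generator with large tau: close to the exponential case.
phi_e, phi_e_inv = exponential_generator(beta=1.0)
phi_p, phi_p_inv = pareto_survival_generator(beta=1.0, tau=1e4)
exp_avg = kn_average(sq_dists, weights, phi_e, phi_e_inv)
pareto_avg = kn_average(sq_dists, weights, phi_p, phi_p_inv)

print(arithmetic, soft_min, exp_avg, pareto_avg)
```

Passing the generator and its inverse as a pair keeps `kn_average` independent of any particular family, so other generators can be tried without changing the averaging code.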

Keywords: Gaussian mixture model; Kolmogorov–Nagumo average; Pareto distribution; fuzzy c-means; generalized energy function; k-means

Year: 2021 PMID: 33923177 DOI: 10.3390/e23050518

Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524

7 in total

1. Statistical mechanics and phase transitions in clustering.

2. General C-means clustering model.

3. Genetic-based EM algorithm for learning Gaussian mixture models.

4. Genetic K-means algorithm.

5. Robust Clustering Method in the Presence of Scattered Observations.

6. mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models.

7. Quasi-linear score for capturing heterogeneous structure in biomarkers.