IDENTIFYING THE NUMBER OF COMPONENTS IN GAUSSIAN MIXTURE MODELS USING NUMERICAL ALGEBRAIC GEOMETRY.

Sara Shirinkam, Adel Alaeddini, Elizabeth Gross.

Abstract

Clustering with Gaussian mixture models is a statistically mature method in data science, with numerous successful applications in science and engineering. The parameters of a Gaussian mixture model are typically estimated from training data using the iterative expectation-maximization algorithm, which requires the number of Gaussian components to be specified a priori. In this study, we propose two algorithms rooted in numerical algebraic geometry, an area-based algorithm and a local maxima algorithm, to identify the optimal number of components. The area-based algorithm transforms several Gaussian mixture models with varying numbers of components into sets of equivalent polynomial regression splines. It then uses homotopy continuation methods to evaluate the resulting splines and identify the number of components that gives the best fit. The local maxima algorithm forms a set of polynomials by fitting a smoothing spline to a kernel density estimate of the data. It then uses numerical algebraic geometry to solve the system of first derivatives and find the local maxima of the resulting smoothing spline; the number of local maxima estimates the number of mixture components. The local maxima algorithm also identifies the locations of the centers of the Gaussian components. Using a real-world case study in automotive manufacturing and multiple simulations, we compare the performance of the proposed algorithms with that of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), which are popular methods in the literature. We show that the proposed algorithms are more robust than AIC and BIC when the Gaussian assumption is violated.
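The idea behind the local maxima algorithm can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the abstract solves the derivative system with numerical algebraic geometry, whereas the sketch below swaps in SciPy's built-in root finder for cubic splines. The function name `estimate_modes`, the grid size, the default `s=0` (which interpolates the already smooth kernel density estimate; a larger `s` smooths more), and the `min_rel_height` filter for negligible tail bumps are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.interpolate import UnivariateSpline


def estimate_modes(data, grid_size=512, s=0.0, min_rel_height=0.05):
    """Estimate component centers as local maxima of a spline fit to a KDE.

    Hypothetical helper: parameter names and defaults are illustrative.
    """
    data = np.asarray(data, dtype=float)

    # Step 1: kernel density estimate evaluated on a regular grid.
    kde = gaussian_kde(data)
    grid = np.linspace(data.min(), data.max(), grid_size)
    density = kde(grid)

    # Step 2: quartic spline through the KDE values, so that its first
    # derivative is a cubic spline (the only degree for which roots()
    # is available in SciPy).
    spline = UnivariateSpline(grid, density, k=4, s=s)

    # Step 3: critical points are the roots of the first derivative.
    critical = spline.derivative().roots()

    # Step 4: keep maxima (negative second derivative) that are not
    # negligible relative to the highest density value (an assumption).
    curvature = spline.derivative(n=2)(critical)
    heights = spline(critical)
    keep = (curvature < 0) & (heights >= min_rel_height * density.max())
    return critical[keep]


# Synthetic example: three well-separated Gaussian components.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(c, 0.7, 600) for c in (-4.0, 0.0, 4.0)])
modes = estimate_modes(sample)
print("estimated number of components:", len(modes))
print("estimated centers:", np.round(modes, 2))
```

The number of returned maxima serves as the estimate of the number of components, and their positions as estimates of the component centers. The result depends on the kernel bandwidth and on the smoothing factor, so both would need tuning on real data.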

Keywords:  Mixture models; model-based clustering; numerical algebraic geometry; smoothing spline

Year: 2019; PMID: 33867617; PMCID: PMC8048412; DOI: 10.1142/s0219498820502047

Source DB: PubMed; Journal: J Algebra Appl; ISSN: 0219-4988; Impact factor: 0.736


