Combining Mixture Components for Clustering.

Jean-Patrick Baudry, Adrian E Raftery, Gilles Celeux, Kenneth Lo, Raphaël Gottardo.

Abstract

Model-based clustering consists of fitting a mixture model to data and identifying each cluster with one of its components. Multivariate normal distributions are typically used. The number of clusters is usually determined from the data, often using BIC. In practice, however, individual clusters can be poorly fitted by Gaussian distributions, and in that case model-based clustering tends to represent one non-Gaussian cluster by a mixture of two or more Gaussian distributions. If the number of mixture components is interpreted as the number of clusters, this can lead to overestimation of the number of clusters. This is because BIC selects the number of mixture components needed to provide a good approximation to the density, rather than the number of clusters as such. We propose first selecting the total number of Gaussian mixture components, K, using BIC and then combining them hierarchically according to an entropy criterion. This yields a unique soft clustering for each number of clusters less than or equal to K. These clusterings can be compared on substantive grounds, and we also describe an automatic way of selecting the number of clusters via a piecewise linear regression fit to the rescaled entropy plot. We illustrate the method with simulated data and a flow cytometry dataset. Supplemental Materials are available on the journal Web site and described at the end of the paper.
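The hierarchical combining step can be illustrated with a small sketch. The abstract does not state the entropy criterion explicitly, so the code below rests on labeled assumptions: the entropy of a soft clustering is taken to be -sum_i sum_k t_ik log t_ik, where t_ik is the posterior probability that observation i belongs to component k; merging two components adds their posterior columns; and at each step the pair whose merge gives the lowest resulting entropy is combined. The function names and the toy posterior matrix are hypothetical, and fitting the Gaussian mixture and choosing K by BIC are not shown.

```python
import math
from itertools import combinations


def xlogx(p):
    """Return p * log(p), with the convention 0 * log(0) = 0."""
    return p * math.log(p) if p > 0.0 else 0.0


def entropy(tau):
    """Entropy of a soft clustering: -sum_i sum_k tau[i][k] * log(tau[i][k])."""
    return -sum(xlogx(p) for row in tau for p in row)


def merge_pair(tau, a, b):
    """Merge components a and b by adding their posterior columns.

    The merged column is appended last; the other columns keep their order.
    """
    merged = []
    for row in tau:
        new_row = [p for k, p in enumerate(row) if k not in (a, b)]
        new_row.append(row[a] + row[b])
        merged.append(new_row)
    return merged


def combine_hierarchically(tau):
    """Greedily merge components, each time picking the lowest-entropy merge.

    Returns a list of (number_of_clusters, entropy, posterior_matrix)
    tuples, from K clusters down to 1. (Assumed criterion, see lead-in.)
    """
    path = [(len(tau[0]), entropy(tau), tau)]
    while len(tau[0]) > 1:
        candidates = (merge_pair(tau, a, b)
                      for a, b in combinations(range(len(tau[0])), 2))
        tau = min(candidates, key=entropy)
        path.append((len(tau[0]), entropy(tau), tau))
    return path


# Made-up posteriors for 5 observations and K = 3 components: components
# 0 and 1 overlap heavily (as if they jointly fit one non-Gaussian cluster),
# while component 2 is well separated.
toy_tau = [
    [0.5, 0.5, 0.0],
    [0.6, 0.4, 0.0],
    [0.4, 0.6, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.1, 0.9],
]

for n_clusters, ent, _ in combine_hierarchically(toy_tau):
    print(n_clusters, round(ent, 3))
```

In this toy case the first merge joins the two overlapping components, which removes most of the entropy. Plotting entropy against the number of clusters gives the kind of curve on which the abstract's piecewise linear fit would be applied; that fitting step is not sketched here.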

Year:  2010        PMID: 20953302      PMCID: PMC2953822          DOI: 10.1198/jcgs.2010.08111

Source DB:  PubMed          Journal:  J Comput Graph Stat        ISSN: 1061-8600            Impact factor:   2.302


