Modeling and visualizing uncertainty in gene expression clusters using Dirichlet process mixtures.

Carl Edward Rasmussen, Bernard J de la Cruz, Zoubin Ghahramani, David L Wild.

Abstract

Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data, little attention has been paid to uncertainty in the results obtained. Dirichlet process mixture (DPM) models provide a nonparametric Bayesian alternative to the bootstrap approach to modeling uncertainty in gene expression clustering. Most previously published applications of Bayesian model-based clustering methods have been to short time series data. In this paper, we present a case study of the application of nonparametric Bayesian clustering methods to the clustering of high-dimensional non-time-series gene expression data using full Gaussian covariances. We use the probability that two genes belong to the same cluster in a DPM model as a measure of the similarity of these gene expression profiles. Conversely, this probability can be used to define a dissimilarity measure, which, for the purposes of visualization, can be input to one of the standard linkage algorithms used for hierarchical clustering. Biologically plausible results are obtained from the Rosetta compendium of expression profiles which extend previously published cluster analyses of this data.
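The similarity-to-dendrogram idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that cluster-label samples for each gene are already available (for example from an MCMC sampler over a DPM), estimates the co-clustering probability as the fraction of samples in which two genes share a label, takes one minus that probability as the dissimilarity, and feeds it to a naive average-linkage routine. The function names, the choice of average linkage, and the toy label samples are all assumptions made for illustration.

```python
from itertools import combinations


def coclustering_matrix(samples):
    """Fraction of label samples in which each pair of genes shares a cluster."""
    n = len(samples[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in samples:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    total = len(samples)
    return [[c / total for c in row] for row in counts]


def dissimilarity(p):
    """Dissimilarity defined as one minus the co-clustering probability."""
    return [[1.0 - v for v in row] for row in p]


def average_linkage(d):
    """Naive average-linkage agglomeration.

    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = [(i,) for i in range(len(d))]
    merges = []

    def dist(a, b):
        # Mean pairwise dissimilarity between the members of two clusters.
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


# Hypothetical toy input: 4 genes, 4 label samples (rows are samples).
# Label values are arbitrary; only equality within a row matters.
samples = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

p = coclustering_matrix(samples)
d = dissimilarity(p)
merges = average_linkage(d)

for a, b, height in merges:
    print(a, b, height)
```

Because only label equality within a sample is used, the estimate is unaffected by label switching between samples. In practice the dissimilarity matrix would more likely be passed to an existing hierarchical clustering library than to a hand-written linkage loop.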

Year: 2009   PMID: 19875860   DOI: 10.1109/TCBB.2007.70269

Source DB: PubMed   Journal: IEEE/ACM Trans Comput Biol Bioinform   ISSN: 1545-5963   Impact factor: 3.710


  9 in total

1.  Probabilistic machine learning and artificial intelligence. (Review)

Authors:  Zoubin Ghahramani
Journal:  Nature       Date:  2015-05-28       Impact factor: 49.962

2.  Biomarker detection and categorization in ribonucleic acid sequencing meta-analysis using Bayesian hierarchical models.

Authors:  Tianzhou Ma; Faming Liang; George Tseng
Journal:  J R Stat Soc Ser C Appl Stat       Date:  2016-12-16       Impact factor: 1.864

3.  Mixture models with a prior on the number of components.

Authors:  Jeffrey W Miller; Matthew T Harrison
Journal:  J Am Stat Assoc       Date:  2017-11-13       Impact factor: 5.033

4.  R/BHC: fast Bayesian hierarchical clustering for microarray data.

Authors:  Richard S Savage; Katherine Heller; Yang Xu; Zoubin Ghahramani; William M Truman; Murray Grant; Katherine J Denby; David L Wild
Journal:  BMC Bioinformatics       Date:  2009-08-06       Impact factor: 3.169

5.  Discovering transcriptional modules by Bayesian data integration.

Authors:  Richard S Savage; Zoubin Ghahramani; Jim E Griffin; Bernard J de la Cruz; David L Wild
Journal:  Bioinformatics       Date:  2010-06-15       Impact factor: 6.937

6.  Bayesian hierarchical clustering for studying cancer gene expression data with unknown statistics.

Authors:  Korsuk Sirinukunwattana; Richard S Savage; Muhammad F Bari; David R J Snead; Nasir M Rajpoot
Journal:  PLoS One       Date:  2013-10-23       Impact factor: 3.240

7.  Patient-specific data fusion defines prognostic cancer subtypes.

Authors:  Yinyin Yuan; Richard S Savage; Florian Markowetz
Journal:  PLoS Comput Biol       Date:  2011-10-20       Impact factor: 4.475

8.  DGEclust: differential expression analysis of clustered count data.

Authors:  Dimitrios V Vavoulis; Margherita Francescatto; Peter Heutink; Julian Gough
Journal:  Genome Biol       Date:  2015-02-20       Impact factor: 13.583

9.  Clustering gene expression time series data using an infinite Gaussian process mixture model.

Authors:  Ian C McDowell; Dinesh Manandhar; Christopher M Vockley; Amy K Schmid; Timothy E Reddy; Barbara E Engelhardt
Journal:  PLoS Comput Biol       Date:  2018-01-16       Impact factor: 4.475
