Bayesian hierarchical clustering for studying cancer gene expression data with unknown statistics.

Korsuk Sirinukunwattana, Richard S Savage, Muhammad F Bari, David R J Snead, Nasir M Rajpoot.

Abstract

Cluster analysis is an important tool for studying gene expression data. The Bayesian hierarchical clustering (BHC) algorithm can automatically infer the number of clusters and uses Bayesian model selection to improve clustering quality. In this paper, we present an extension of the BHC algorithm. Our Gaussian BHC (GBHC) algorithm represents data as a mixture of Gaussian distributions. It uses a normal-gamma distribution as a conjugate prior on the mean and precision of each Gaussian component. We tested GBHC on 11 cancer and 3 synthetic datasets. The results on the cancer datasets show that, in sample clustering, GBHC on average produces a clustering partition that is more concordant with the ground truth than those obtained from other commonly used algorithms. Furthermore, GBHC frequently infers a number of clusters that is close to the ground truth. In gene clustering, GBHC also produces a clustering partition that is more biologically plausible than those of several other state-of-the-art methods. This suggests GBHC as an alternative tool for studying gene expression data. The implementation of GBHC is available at https://sites.google.com/site/gaussianbhc/
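
To illustrate why a normal-gamma prior is convenient here, the following is a minimal sketch (not the authors' GBHC implementation) of the closed-form log marginal likelihood of univariate Gaussian data with unknown mean and precision. The function names `posterior` and `log_marginal`, the hyperparameter defaults (`mu0`, `kappa0`, `alpha0`, `beta0`), and the toy data are assumptions chosen for illustration only. Comparing the evidence for a merged cluster against the evidence for two separate clusters gives a simplified flavour of the Bayesian model selection that BHC-style algorithms perform at each merge; the full BHC recursion over tree-consistent partitions is not reproduced.

```python
import math


def posterior(xs, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Normal-gamma posterior hyperparameters after observing xs.

    The prior is on (mean, precision) of a univariate Gaussian; the
    default hyperparameters are illustrative assumptions, not values
    taken from the paper.
    """
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * mean) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (mean - mu0) ** 2 / (2.0 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n


def log_marginal(xs, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Log marginal likelihood of xs with mean and precision integrated out."""
    n = len(xs)
    _, kappa_n, alpha_n, beta_n = posterior(xs, mu0, kappa0, alpha0, beta0)
    return (
        math.lgamma(alpha_n) - math.lgamma(alpha0)
        + alpha0 * math.log(beta0) - alpha_n * math.log(beta_n)
        + 0.5 * (math.log(kappa0) - math.log(kappa_n))
        - 0.5 * n * math.log(2.0 * math.pi)
    )


# Toy example: two well-separated groups of expression values.
group_a = [-5.1, -4.9, -5.0]
group_b = [5.0, 5.1, 4.9]

log_merged = log_marginal(group_a + group_b)
log_separate = log_marginal(group_a) + log_marginal(group_b)

# A negative value favours keeping the two groups as separate clusters.
log_bayes_factor = log_merged - log_separate
print(log_bayes_factor)
```

Because the prior is conjugate, the evidence for any candidate cluster is available in closed form, so candidate merges can be scored without numerical integration. A multivariate dataset would need either a per-dimension product of such terms (assuming independent dimensions) or a different conjugate prior; which of these GBHC uses is not stated in the abstract.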

Year: 2013; PMID: 24194826; PMCID: PMC3806770; DOI: 10.1371/journal.pone.0075748

Source DB: PubMed; Journal: PLoS One; ISSN: 1932-6203; Impact factor: 3.240


