
A ground truth based comparative study on clustering of gene expression data.

Yitan Zhu, Zuyi Wang, David J Miller, Robert Clarke, Jianhua Xuan, Eric P Hoffman, Yue Wang.

Abstract

Given the variety of available clustering methods for gene expression data analysis, it is important to develop an appropriate and rigorous validation scheme to assess the performance and limitations of the most widely used clustering algorithms. In this paper, we present a ground truth based comparative study on the functionality, accuracy, and stability of five data clustering methods, namely hierarchical clustering, K-means clustering, self-organizing maps, standard finite normal mixture fitting, and a caBIG toolkit (VIsual Statistical Data Analyzer--VISDA), tested on sample clustering of seven published microarray gene expression datasets and one synthetic dataset. We examined the performance of these algorithms in both data-sufficient and data-insufficient cases using quantitative performance measures, including cluster number detection accuracy and mean and standard deviation of partition accuracy. The experimental results showed that VISDA, an interactive coarse-to-fine maximum likelihood fitting algorithm, is a solid performer on most of the datasets, while K-means clustering and self-organizing maps optimized by the mean squared compactness criterion generally produce more stable solutions than the other methods.
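The quantitative measures named in the abstract can be illustrated with a small sketch. The abstract does not give the exact definitions used in the paper, so everything below is an assumption: partition accuracy is taken to be the fraction of samples whose cluster maps to their ground-truth class under the best one-to-one matching of clusters to classes, cluster number detection accuracy is taken to be the fraction of runs that recover the true number of clusters, and the sample standard deviation is used. The function names are hypothetical and are not from VISDA or caBIG.

```python
from itertools import permutations
from statistics import mean, stdev


def partition_accuracy(true_labels, cluster_labels):
    """Fraction of samples matched under the best one-to-one mapping of
    cluster ids to ground-truth classes (assumed definition)."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    # Pad with None so surplus clusters can be left unmatched.
    size = max(len(classes), len(clusters))
    padded = classes + [None] * (size - len(classes))
    best_hits = 0
    # Brute-force search; only practical for a handful of clusters.
    for perm in permutations(padded, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(
            1 for t, c in zip(true_labels, cluster_labels) if mapping[c] == t
        )
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


def cluster_number_detection_accuracy(detected_counts, true_count):
    """Fraction of runs whose detected cluster count equals the true count
    (assumed definition)."""
    return sum(1 for k in detected_counts if k == true_count) / len(detected_counts)


def summarize_runs(true_labels, runs):
    """Mean and sample standard deviation of partition accuracy over runs."""
    scores = [partition_accuracy(true_labels, run) for run in runs]
    return mean(scores), stdev(scores)


# Toy ground truth and two hypothetical clustering runs (not data from the paper).
truth = [0, 0, 0, 1, 1, 1]
runs = [
    [1, 1, 1, 0, 0, 0],  # same partition with swapped cluster ids
    [0, 0, 1, 1, 1, 1],  # one sample placed in the wrong cluster
]
avg, spread = summarize_runs(truth, runs)
```

The brute-force matching keeps the sketch dependency-free; for more than a few clusters an assignment solver such as the Hungarian algorithm would be the usual substitute.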

Year: 2008; PMID: 18508478; PMCID: PMC4737472; DOI: 10.2741/2972

Source DB: PubMed; Journal: Front Biosci; ISSN: 1093-4715


  4 in total

1.  Dose-related gene expression changes in forebrain following acute, low-level chlorpyrifos exposure in neonatal rats.

Authors:  Anamika Ray; Jing Liu; Patricia Ayoubi; Carey Pope
Journal:  Toxicol Appl Pharmacol       Date:  2010-08-05       Impact factor: 4.219

2.  The transcriptomic revolution and radiation biology.

Authors:  Sally A Amundson
Journal:  Int J Radiat Biol       Date:  2021-10-11       Impact factor: 3.352

3.  caBIG VISDA: modeling, visualization, and discovery for cluster analysis of genomic data.

Authors:  Yitan Zhu; Huai Li; David J Miller; Zuyi Wang; Jianhua Xuan; Robert Clarke; Eric P Hoffman; Yue Wang
Journal:  BMC Bioinformatics       Date:  2008-09-18       Impact factor: 3.169

4.  Copy number analysis indicates monoclonal origin of lethal metastatic prostate cancer.

Authors:  Wennuan Liu; Sari Laitinen; Sofia Khan; Mauno Vihinen; Jeanne Kowalski; Guoqiang Yu; Li Chen; Charles M Ewing; Mario A Eisenberger; Michael A Carducci; William G Nelson; Srinivasan Yegnasubramanian; Jun Luo; Yue Wang; Jianfeng Xu; William B Isaacs; Tapio Visakorpi; G Steven Bova
Journal:  Nat Med       Date:  2009-04-12       Impact factor: 53.440
