
Judging the quality of gene expression-based clustering methods using gene annotation.

Francis D Gibbons, Frederick P Roth.

Abstract

We compare several commonly used expression-based gene clustering algorithms using a figure of merit based on the mutual information between cluster membership and known gene attributes. By studying various publicly available expression data sets we conclude that enrichment of clusters for biological function is, in general, highest at rather low cluster numbers. As a measure of dissimilarity between the expression patterns of two genes, no method outperforms Euclidean distance for ratio-based measurements, or Pearson distance for non-ratio-based measurements at the optimal choice of cluster number. We show the self-organized-map approach to be best for both measurement types at higher numbers of clusters. Clusters of genes derived from single- and average-linkage hierarchical clustering tend to produce worse-than-random results.
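As a rough illustration of the quantities the abstract names, the following Python sketch computes the mutual information between cluster membership and a single gene attribute, plus the two dissimilarity measures mentioned. It is not the authors' implementation: the abstract does not specify how the figure of merit is normalized or aggregated over attributes, so only plain mutual information in bits is shown. The function names and the definition of Pearson distance as 1 - r are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(clusters, attribute):
    """Mutual information (in bits) between two equal-length label sequences.

    `clusters` holds one cluster label per gene; `attribute` holds one
    annotation value per gene (for example True/False for a functional class).
    Both names are hypothetical, chosen for this sketch.
    """
    if len(clusters) != len(attribute):
        raise ValueError("label sequences must have the same length")
    n = len(clusters)
    joint = Counter(zip(clusters, attribute))  # counts of (cluster, attribute) pairs
    n_c = Counter(clusters)                    # marginal counts per cluster
    n_a = Counter(attribute)                   # marginal counts per attribute value
    mi = 0.0
    for (c, a), n_ca in joint.items():
        # p(c,a) * log2( p(c,a) / (p(c) * p(a)) ), written with raw counts
        mi += (n_ca / n) * math.log2(n_ca * n / (n_c[c] * n_a[a]))
    return mi


def euclidean_distance(x, y):
    """Euclidean distance between two expression profiles (Python 3.8+)."""
    return math.dist(x, y)


def pearson_distance(x, y):
    """Assumed definition: 1 - Pearson correlation of two expression profiles.

    Undefined (division by zero) when either profile has zero variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)
```

With four genes split evenly into two clusters, an attribute that coincides exactly with cluster membership yields 1 bit of mutual information, while an attribute distributed evenly across both clusters yields 0 bits.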

Year:  2002        PMID: 12368250      PMCID: PMC187526          DOI: 10.1101/gr.397002

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


  29 in total

1.  Clustering based on conditional distributions in an auxiliary space.

Authors:  Janne Sinkkonen; Samuel Kaski
Journal:  Neural Comput       Date:  2002-01       Impact factor: 2.026

2.  A stability based method for discovering structure in clustered data.

Authors:  Asa Ben-Hur; Andre Elisseeff; Isabelle Guyon
Journal:  Pac Symp Biocomput       Date:  2002

3.  Missing value estimation methods for DNA microarrays.

Authors:  O Troyanskaya; M Cantor; G Sherlock; P Brown; T Hastie; R Tibshirani; D Botstein; R B Altman
Journal:  Bioinformatics       Date:  2001-06       Impact factor: 6.937

Review 4.  Computational analysis of microarray data.

Authors:  J Quackenbush
Journal:  Nat Rev Genet       Date:  2001-06       Impact factor: 53.242

5.  Saccharomyces Genome Database.

Authors:  Laurie Issel-Tarver; Karen R Christie; Kara Dolinski; Rey Andrada; Rama Balakrishnan; Catherine A Ball; Gail Binkley; Stan Dong; Selina S Dwight; Dianna G Fisk; Midori Harris; Mark Schroeder; Anand Sethuraman; Kane Tse; Shuai Weng; David Botstein; J Michael Cherry
Journal:  Methods Enzymol       Date:  2002       Impact factor: 1.600

6.  Cluster analysis and data visualization of large-scale gene expression data.

Authors:  G S Michaels; D B Carr; M Askenazi; S Fuhrman; X Wen; R Somogyi
Journal:  Pac Symp Biocomput       Date:  1998

7.  Exploring the metabolic and genetic control of gene expression on a genomic scale.

Authors:  J L DeRisi; V R Iyer; P O Brown
Journal:  Science       Date:  1997-10-24       Impact factor: 47.728

8.  Large-scale temporal gene expression mapping of central nervous system development.

Authors:  X Wen; S Fuhrman; G S Michaels; D B Carr; S Smith; J L Barker; R Somogyi
Journal:  Proc Natl Acad Sci U S A       Date:  1998-01-06       Impact factor: 11.205

9.  Discrimination between paralogs using microarray analysis: application to the Yap1p and Yap2p transcriptional networks.

Authors:  Barak A Cohen; Yitzhak Pilpel; Robi D Mitra; George M Church
Journal:  Mol Biol Cell       Date:  2002-05       Impact factor: 4.138

10.  A genome-wide transcriptional analysis of the mitotic cell cycle.

Authors:  R J Cho; M J Campbell; E A Winzeler; L Steinmetz; A Conway; L Wodicka; T G Wolfsberg; A E Gabrielian; D Landsman; D J Lockhart; R W Davis
Journal:  Mol Cell       Date:  1998-07       Impact factor: 17.970

  78 in total

1.  LSimpute: accurate estimation of missing values in microarray data with least squares methods.

Authors:  Trond Hellem Bø; Bjarte Dysvik; Inge Jonassen
Journal:  Nucleic Acids Res       Date:  2004-02-20       Impact factor: 16.971

2.  Identification of certain cancer-mediating genes using Gaussian fuzzy cluster validity index.

Authors:  Anupam Ghosh; Rajat K De
Journal:  J Biosci       Date:  2015-10       Impact factor: 1.826

3.  Analysis of time-series gene expression data: methods, challenges, and opportunities.

Authors:  I P Androulakis; E Yang; R R Almon
Journal:  Annu Rev Biomed Eng       Date:  2007       Impact factor: 9.590

4.  Spectral preprocessing for clustering time-series gene expressions.

Authors:  Wentao Zhao; Erchin Serpedin; Edward R Dougherty
Journal:  EURASIP J Bioinform Syst Biol       Date:  2009-04-08

Review 5.  A ground truth based comparative study on clustering of gene expression data.

Authors:  Yitan Zhu; Zuyi Wang; David J Miller; Robert Clarke; Jianhua Xuan; Eric P Hoffman; Yue Wang
Journal:  Front Biosci       Date:  2008-05-01

6.  Modularity of stress response evolution.

Authors:  Amoolya H Singh; Denise M Wolf; Peggy Wang; Adam P Arkin
Journal:  Proc Natl Acad Sci U S A       Date:  2008-05-21       Impact factor: 11.205

7.  Cluster analysis: an alternative method for covariate selection in population pharmacokinetic modeling.

Authors:  Nabil Semmar; Bernard Bruguerolle; Sandrine Boullu-Ciocca; Nicolas Simon
Journal:  J Pharmacokinet Pharmacodyn       Date:  2005-08       Impact factor: 2.745

8.  A gene ontology inferred from molecular networks.

Authors:  Janusz Dutkowski; Michael Kramer; Michal A Surma; Rama Balakrishnan; J Michael Cherry; Nevan J Krogan; Trey Ideker
Journal:  Nat Biotechnol       Date:  2013-01       Impact factor: 54.908

9.  Expression profiles of switch-like genes accurately classify tissue and infectious disease phenotypes in model-based classification.

Authors:  Michael Gormley; Aydin Tozeren
Journal:  BMC Bioinformatics       Date:  2008-11-17       Impact factor: 3.169

10.  Information criterion-based clustering with order-restricted candidate profiles in short time-course microarray experiments.

Authors:  Tianqing Liu; Nan Lin; Ningzhong Shi; Baoxue Zhang
Journal:  BMC Bioinformatics       Date:  2009-05-15       Impact factor: 3.169

