
How well do we understand the clusters found in microarray data?

Amanda Clare, Ross D King.

Abstract

We wished to quantify the state-of-the-art of our understanding of clusters in microarray data. To do this we systematically compared the clusters produced on sets of microarray data using a representative set of clustering algorithms (hierarchical, k-means, and a modified version of QT_CLUST) with the annotation schemes MIPS, GeneOntology and GenProtEC. We assumed that if a cluster reflected known biology its members would share related ontological annotations. This assumption is the basis of "guilt-by-association" and is commonly used to assign the putative function of proteins. To statistically measure the relationship between cluster and annotation we developed a new predictive discriminatory measure. We found that the clusters found in microarray data do not in general agree with functional annotation classes. Although many statistically significant relationships can be found, the majority of clusters are not related to known biology (as described in annotation ontologies). This implies that use of guilt-by-association is not supported by annotation ontologies. Depending on the estimate of the amount of noise in the data, our results suggest that bioinformatics has only codified a small proportion of the biological knowledge required to understand microarray data.
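The comparison of clusters against annotation classes can be illustrated with a minimal sketch. It does not reproduce the predictive discriminatory measure the authors developed, which the abstract does not specify. As a stand-in it uses a standard hypergeometric over-representation test. The function names, the toy gene identifiers, and the class labels are all hypothetical, and every gene is assumed to carry exactly one annotation.

```python
from collections import Counter
from math import comb


def hypergeom_tail(k, cluster_size, class_size, total):
    """P(X >= k) when cluster_size genes are drawn without replacement
    from `total` genes, of which `class_size` carry the annotation."""
    upper = min(cluster_size, class_size)
    favourable = sum(
        comb(class_size, i) * comb(total - class_size, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(total, cluster_size)


def cluster_annotation_agreement(clusters, annotation):
    """For each cluster, report its most common annotation class, the
    fraction of members in that class, and the hypergeometric p-value.

    Stand-in measure only; NOT the measure used in the paper.
    Assumes every clustered gene appears in `annotation`.
    """
    class_sizes = Counter(annotation.values())
    total = len(annotation)
    results = {}
    for name, genes in clusters.items():
        labels = Counter(annotation[g] for g in genes)
        top_class, k = labels.most_common(1)[0]
        p_value = hypergeom_tail(k, len(genes), class_sizes[top_class], total)
        results[name] = (top_class, k / len(genes), p_value)
    return results


# Toy, made-up data: six genes in two hypothetical annotation classes.
annotation = {
    "g1": "metabolism", "g2": "metabolism", "g3": "metabolism",
    "g4": "transport", "g5": "transport", "g6": "transport",
}
clusters = {"c1": ["g1", "g2", "g3"], "c2": ["g1", "g4", "g5"]}

results = cluster_annotation_agreement(clusters, annotation)
for name, (top_class, fraction, p_value) in results.items():
    print(name, top_class, round(fraction, 2), round(p_value, 3))
```

In a real analysis the p-values would need correction for the number of cluster and class pairs tested, and genes with several annotations would need separate handling.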

Year:  2002        PMID: 12611631

Source DB:  PubMed          Journal:  In Silico Biol        ISSN: 1386-6338


  8 in total

1.  The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes.

Authors:  Andreas Ruepp; Alfred Zollner; Dieter Maier; Kaj Albermann; Jean Hani; Martin Mokrejs; Igor Tetko; Ulrich Güldener; Gertrud Mannhaupt; Martin Münsterkötter; H Werner Mewes
Journal:  Nucleic Acids Res       Date:  2004-10-14       Impact factor: 16.971

2.  Gene expression profiling and the use of genome-scale in silico models of Escherichia coli for analysis: providing context for content. (Review)

Authors:  Nathan E Lewis; Byung-Kwan Cho; Eric M Knight; Bernhard O Palsson
Journal:  J Bacteriol       Date:  2009-04-10       Impact factor: 3.490

3.  Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks.

Authors:  Cecily J Wolfe; Isaac S Kohane; Atul J Butte
Journal:  BMC Bioinformatics       Date:  2005-09-14       Impact factor: 3.169

4.  The Mouse Functional Genome Database (MfunGD): functional annotation of proteins in the light of their cellular context.

Authors:  Andreas Ruepp; Octave Noubibou Doudieu; Jos van den Oever; Barbara Brauner; Irmtraud Dunger-Kaltenbach; Gisela Fobo; Goar Frishman; Corinna Montrone; Christine Skornia; Steffi Wanka; Thomas Rattei; Philipp Pagel; Louise Riley; Dmitrij Frishman; Dimitrij Surmeli; Igor V Tetko; Matthias Oesterheld; Volker Stümpflen; H Werner Mewes
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

5.  Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks.

Authors:  David J Reiss; Nitin S Baliga; Richard Bonneau
Journal:  BMC Bioinformatics       Date:  2006-06-02       Impact factor: 3.169

6.  Large-scale clustering of CAGE tag expression data.

Authors:  Kazuro Shimokawa; Yuko Okamura-Oho; Takio Kurita; Martin C Frith; Jun Kawai; Piero Carninci; Yoshihide Hayashizaki
Journal:  BMC Bioinformatics       Date:  2007-05-21       Impact factor: 3.169

7.  Recursive cluster elimination (RCE) for classification and feature selection from gene expression data.

Authors:  Malik Yousef; Segun Jung; Louise C Showe; Michael K Showe
Journal:  BMC Bioinformatics       Date:  2007-05-02       Impact factor: 3.169

8.  Public databases and software for the pathway analysis of cancer genomes.

Authors:  Ivy F L Tsui; Raj Chari; Timon P H Buys; Wan L Lam
Journal:  Cancer Inform       Date:  2007-12-12
