Robust multi-scale clustering of large DNA microarray datasets with the consensus algorithm.

Thomas Grotkjaer, Ole Winther, Birgitte Regenberg, Jens Nielsen, Lars Kai Hansen.

Abstract

MOTIVATION: Hierarchical and relocation clustering (e.g. K-means and self-organizing maps) have been successful tools in the display and analysis of whole genome DNA microarray expression data. However, the results of hierarchical clustering are sensitive to outliers, and most relocation methods give results which are dependent on the initialization of the algorithm. Therefore, it is difficult to assess the significance of the results. We have developed a consensus clustering algorithm, where the final result is averaged over multiple clustering runs, giving a robust and reproducible clustering, capable of capturing small signal variations. The algorithm preserves valuable properties of hierarchical clustering, which is useful for visualization and interpretation of the results.
RESULTS: We show for the first time that one can take advantage of multiple clustering runs in DNA microarray analysis by collecting re-occurring clustering patterns in a co-occurrence matrix. The results show that consensus clustering obtained from clustering multiple times with Variational Bayes Mixtures of Gaussians or K-means significantly reduces the classification error rate for a simulated dataset. The method is flexible and it is possible to find consensus clusters from different clustering algorithms. Thus, the algorithm can be used as a framework to test in a quantitative manner the homogeneity of different clustering algorithms. We compare the method with a number of state-of-the-art clustering methods. It is shown that the method is robust and gives low classification error rates for a realistic, simulated dataset. The algorithm is also demonstrated for real datasets. It is shown that more biologically meaningful transcriptional patterns can be found without conservative statistical or fold-change exclusion of data.
AVAILABILITY: Matlab source code for the clustering algorithm ClusterLustre, and the simulated dataset for testing are available upon request from T.G. and O.W.
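
The idea of collecting re-occurring clustering patterns in a co-occurrence matrix can be illustrated with a small Python sketch. This is not the authors' ClusterLustre code (which is Matlab and available only on request), and the abstract does not specify the details filled in here: the function names, the toy data, the use of plain K-means with random initialization, and the final average-linkage step on the distance 1 - co-occurrence are all assumptions made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_labels(points, k, rng, n_iter=20):
    """Plain K-means from a random initialization; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def cooccurrence_matrix(points, k, n_runs, seed=0):
    """Fraction of runs in which each pair of points lands in the same cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                counts[i][j] += labels[i] == labels[j]
    return [[c / n_runs for c in row] for row in counts]

def consensus_clusters(co, n_clusters):
    """Average-linkage agglomeration on the distance 1 - co-occurrence (assumed choice)."""
    clusters = [[i] for i in range(len(co))]

    def avg_dist(a, b):
        return sum(1.0 - co[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [(x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))]
        a, b = min(pairs, key=lambda xy: avg_dist(clusters[xy[0]], clusters[xy[1]]))
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    return [sorted(c) for c in clusters]

# Toy data (hypothetical): two tight, well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]

# Deliberately over-cluster with k=3 so that individual runs disagree on how
# to split a group; the averaged matrix still separates the two groups.
co = cooccurrence_matrix(points, k=3, n_runs=50)
consensus = consensus_clusters(co, n_clusters=2)
print(consensus)
```

Individual runs in this sketch depend on the random initialization, which is the sensitivity the abstract describes; averaging over runs yields a pairwise similarity that is stable across initializations and can be cut at different levels to view the data at different scales. Labels from other clustering algorithms could be fed into the same matrix, since only the per-run labels are used.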

Year:  2005        PMID: 16257984     DOI: 10.1093/bioinformatics/bti746

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  18 in total

1.  A permutation test for determining significance of clusters with applications to spatial and gene expression data.

Authors:  P J Park; J Manjourides; M Bonetti; M Pagano
Journal:  Comput Stat Data Anal       Date:  2009-10-01       Impact factor: 1.681

2.  Global transcriptional and physiological responses of Saccharomyces cerevisiae to ammonium, L-alanine, or L-glutamine limitation.

Authors:  Renata Usaite; Kiran R Patil; Thomas Grotkjaer; Jens Nielsen; Birgitte Regenberg
Journal:  Appl Environ Microbiol       Date:  2006-09       Impact factor: 4.792

3.  Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury.

Authors:  Jesper Ryge; Ole Winther; Jacob Wienecke; Albin Sandelin; Ann-Charlotte Westerdahl; Hans Hultborn; Ole Kiehn
Journal:  BMC Genomics       Date:  2010-06-09       Impact factor: 3.969

4.  Mapping the interaction of Snf1 with TORC1 in Saccharomyces cerevisiae.

Authors:  Jie Zhang; Stefania Vaga; Pramote Chumnanpuen; Rahul Kumar; Goutham N Vemuri; Ruedi Aebersold; Jens Nielsen
Journal:  Mol Syst Biol       Date:  2011-11-08       Impact factor: 11.429

5.  Metabolic network driven analysis of genome-wide transcription data from Aspergillus nidulans.

Authors:  Helga David; Gerald Hofmann; Ana Paula Oliveira; Hanne Jarmer; Jens Nielsen
Journal:  Genome Biol       Date:  2006       Impact factor: 13.583

6.  Systemic analysis of the response of Aspergillus niger to ambient pH.

Authors:  Mikael R Andersen; Linda Lehmann; Jens Nielsen
Journal:  Genome Biol       Date:  2009-05-01       Impact factor: 13.583

7.  Proteome analysis of Aspergillus niger: lactate added in starch-containing medium can increase production of the mycotoxin fumonisin B2 by modifying acetyl-CoA metabolism.

Authors:  Louise M Sørensen; Rene Lametsch; Mikael R Andersen; Per V Nielsen; Jens C Frisvad
Journal:  BMC Microbiol       Date:  2009-12-10       Impact factor: 3.605

8.  Combinatorial effects of environmental parameters on transcriptional regulation in Saccharomyces cerevisiae: a quantitative analysis of a compendium of chemostat-based transcriptome data.

Authors:  Theo A Knijnenburg; Jean-Marc G Daran; Marcel A van den Broek; Pascale As Daran-Lapujade; Johannes H de Winde; Jack T Pronk; Marcel J T Reinders; Lodewyk F A Wessels
Journal:  BMC Genomics       Date:  2009-01-27       Impact factor: 3.969

9.  Mapping the polysaccharide degradation potential of Aspergillus niger.

Authors:  Mikael R Andersen; Malene Giese; Ronald P de Vries; Jens Nielsen
Journal:  BMC Genomics       Date:  2012-07-16       Impact factor: 3.969

10.  Transcription factor control of growth rate dependent genes in Saccharomyces cerevisiae: a three factor design.

Authors:  Alessandro Fazio; Michael C Jewett; Pascale Daran-Lapujade; Roberta Mustacchi; Renata Usaite; Jack T Pronk; Christopher T Workman; Jens Nielsen
Journal:  BMC Genomics       Date:  2008-07-18       Impact factor: 3.969
