K-ary clustering with optimal leaf ordering for gene expression data.

Ziv Bar-Joseph, Erik D Demaine, David K Gifford, Nathan Srebro, Angèle M Hamel, Tommi S Jaakkola.

Abstract

MOTIVATION: A major challenge in gene expression analysis is effective data organization and visualization. One of the most popular tools for this task is hierarchical clustering. Hierarchical clustering allows a user to view relationships at scales ranging from single genes to large sets of genes, while at the same time providing a global view of the expression data. However, hierarchical clustering is very sensitive to noise, usually lacks a method to identify distinct clusters, and produces a large number of possible leaf orderings of the hierarchical clustering tree. In this paper we propose a new hierarchical clustering algorithm which reduces susceptibility to noise, permits up to k siblings to be directly related, and provides a single optimal order for the resulting tree.
RESULTS: We present an algorithm that efficiently constructs a k-ary tree, where each node can have up to k children, and then optimally orders the leaves of that tree. By combining k clusters at each step our algorithm becomes more robust against noise and missing values. By optimally ordering the leaves of the resulting tree we maintain the pairwise relationships that appear in the original method, without sacrificing the robustness. Our k-ary construction algorithm runs in O(n^3) regardless of k and our ordering algorithm runs in O(4^k n^3). We present several examples that show that our k-ary clustering algorithm achieves results that are superior to the binary tree results in both global presentation and cluster identification.
AVAILABILITY: We have implemented the above algorithms in C++ on the Linux operating system.
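The leaf-ordering idea can be illustrated with a small brute-force sketch. This is not the authors' algorithm (which the abstract states runs in polynomial time in n) and not their C++ implementation. It enumerates every ordering obtainable by permuting the children of each internal node of a small k-ary tree and keeps the cheapest one. The objective (minimizing the summed distance between adjacent leaves), the nested-tuple tree encoding, and all function names are assumptions made for illustration.

```python
from itertools import permutations, product

def leaf_orderings(tree):
    """Yield every leaf order reachable by permuting children at each internal node.

    A leaf is an int (an index into the distance matrix); an internal node is a
    tuple of up to k subtrees. This encoding is a hypothetical choice.
    """
    if isinstance(tree, int):
        yield (tree,)
        return
    child_orders = [list(leaf_orderings(child)) for child in tree]
    for perm in permutations(range(len(tree))):
        # Pick one internal ordering per child, with children arranged as in perm.
        for combo in product(*(child_orders[i] for i in perm)):
            yield tuple(leaf for part in combo for leaf in part)

def ordering_cost(order, dist):
    """Sum of distances between consecutive leaves (assumed objective)."""
    return sum(dist[a][b] for a, b in zip(order, order[1:]))

def best_ordering(tree, dist):
    """Exhaustive search; only feasible for very small trees."""
    return min(leaf_orderings(tree), key=lambda o: ordering_cost(o, dist))

# Toy data: five "genes" placed on a line; distance is absolute difference.
positions = [1, 0, 2, 4, 3]
dist = [[abs(p - q) for q in positions] for p in positions]
tree = ((0, 1), (2, 3, 4))  # a root with one binary and one 3-ary child

order = best_ordering(tree, dist)
print(order, ordering_cost(order, dist))
```

The tree fixes which leaves belong together, while the ordering step only chooses how each node's children are arranged. The exhaustive search grows factorially with the number of children and nodes, which is why a dynamic-programming approach such as the one the abstract describes is needed for real expression data.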

Year:  2003        PMID: 12801867     DOI: 10.1093/bioinformatics/btg030

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  17 in total

1.  Gibberellins regulate lateral root formation in Populus through interactions with auxin and other hormones.

Authors:  Jiqing Gou; Steven H Strauss; Chung Jui Tsai; Kai Fang; Yiru Chen; Xiangning Jiang; Victor B Busov
Journal:  Plant Cell       Date:  2010-03-30       Impact factor: 11.277

2.  A visualization system for space-time and multivariate patterns (VIS-STAMP).

Authors:  Diansheng Guo; Jin Chen; Alan M MacEachren; Ke Liao
Journal:  IEEE Trans Vis Comput Graph       Date:  2006 Nov-Dec       Impact factor: 4.579

3.  Pedigreed primate embryonic stem cells express homogeneous familial gene profiles.

Authors:  Jocelyn D Mich-Basso; Carrie J Redinger; Christopher S Navara; Ahmi Ben-Yehudah; Ethan Jacoby; Elizabeta Kovkarova-Naumovski; Meena Sukhwani; Kyle Orwig; Naftali Kaminski; Carlos A Castro; Calvin R Simerly; Gerald Schatten
Journal:  Stem Cells       Date:  2007-07-19       Impact factor: 6.277

4.  Supporting the Process of Exploring and Interpreting Space-Time Multivariate Patterns: The Visual Inquiry Toolkit.

Authors:  Jin Chen; Alan M Maceachren; Diansheng Guo
Journal:  Cartogr Geogr Inf Sci       Date:  2008-01-01

5.  Rapid resistome fingerprinting and clonal lineage profiling of carbapenem-resistant Klebsiella pneumoniae isolates by targeted next-generation sequencing.

Authors:  Fabio Arena; P Alexander Rolfe; Graeme Doran; Viola Conte; Sarah Gruszka; Thomas Clarke; Yemi Adesokan; Tommaso Giani; Gian Maria Rossolini
Journal:  J Clin Microbiol       Date:  2014-01-08       Impact factor: 5.948

6.  Constructing overview+detail dendrogram-matrix views.

Authors:  Jin Chen; Alan M MacEachren; Donna J Peuquet
Journal:  IEEE Trans Vis Comput Graph       Date:  2009 Nov-Dec       Impact factor: 4.579

7.  Ligand-dependent dynamics of retinoic acid receptor binding during early neurogenesis.

Authors:  Shaun Mahony; Esteban O Mazzoni; Scott McCuine; Richard A Young; Hynek Wichterle; David K Gifford
Journal:  Genome Biol       Date:  2011-01-13       Impact factor: 13.583

8.  Biclustering via optimal re-ordering of data matrices in systems biology: rigorous methods and comparative studies.

Authors:  Peter A DiMaggio; Scott R McAllister; Christodoulos A Floudas; Xiao-Jiang Feng; Joshua D Rabinowitz; Herschel A Rabitz
Journal:  BMC Bioinformatics       Date:  2008-10-27       Impact factor: 3.169

9.  An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles.

Authors:  Appala Raju Kotaru; Khader Shameer; Pandurangan Sundaramurthy; Ramesh Chandra Joshi
Journal:  Bioinformation       Date:  2013-04-13

10.  An improved method for identifying functionally linked proteins using phylogenetic profiles.

Authors:  Shawn Cokus; Sayaka Mizutani; Matteo Pellegrini
Journal:  BMC Bioinformatics       Date:  2007-05-22       Impact factor: 3.169
