Finding biologically accurate clusterings in hierarchical tree decompositions using the variation of information.

Saket Navlakha, James White, Niranjan Nagarajan, Mihai Pop, Carl Kingsford.

Abstract

Hierarchical clustering is a popular method for grouping together similar elements based on a distance measure between them. In many cases, annotations for some elements are known beforehand, which can aid the clustering process. We present a novel approach for decomposing a hierarchical clustering into the clusters that optimally match a set of known annotations, as measured by the variation of information metric. Our approach is general and does not require the user to enter the number of clusters desired. We apply it to two biological domains: finding protein complexes within protein interaction networks and identifying species within metagenomic DNA samples. For these two applications, we test the quality of our clusters by using them to predict complex and species membership, respectively. We find that our approach generally outperforms the commonly used heuristic methods.
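
The variation of information (VI) between two clusterings A and B of the same elements is VI(A, B) = H(A) + H(B) - 2 I(A; B), where H is entropy and I is mutual information. It is zero exactly when the two clusterings are identical up to relabeling. The sketch below is only an illustration of the metric itself, not the authors' implementation or their tree-decomposition algorithm; the function name `variation_of_information` and the label-list input format are assumptions made for the example.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """Return VI(A, B) = H(A|B) + H(B|A) in nats.

    labels_a and labels_b are equal-length sequences; position i holds
    the cluster label of element i in clustering A and B respectively.
    (Hypothetical helper for illustration only.)
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same elements")
    n = len(labels_a)
    if n == 0:
        return 0.0

    count_a = Counter(labels_a)               # cluster sizes in A
    count_b = Counter(labels_b)               # cluster sizes in B
    joint = Counter(zip(labels_a, labels_b))  # sizes of pairwise overlaps

    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Contribution of this overlap to H(A|B) + H(B|A).
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


# Hypothetical toy data: known annotations vs. one candidate clustering.
known = ["c1", "c1", "c2", "c2"]
candidate = ["x", "x", "y", "y"]
print(variation_of_information(known, candidate))
```

In a setting like the one the abstract describes, a lower VI against the known annotations would indicate a candidate decomposition of the tree that matches them more closely.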

Year:  2010        PMID: 20377460     DOI: 10.1089/cmb.2009.0173

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  12 in total

1.  The power of protein interaction networks for associating genes with diseases.

Authors:  Saket Navlakha; Carl Kingsford
Journal:  Bioinformatics       Date:  2010-02-24       Impact factor: 6.937

2.  Alignment and clustering of phylogenetic markers--implications for microbial diversity studies.

Authors:  James R White; Saket Navlakha; Niranjan Nagarajan; Mohammad-Reza Ghodsi; Carl Kingsford; Mihai Pop
Journal:  BMC Bioinformatics       Date:  2010-03-24       Impact factor: 3.169

3.  FUSE: a profit maximization approach for functional summarization of biological networks.

Authors:  Boon-Siew Seah; Sourav S Bhowmick; C Forbes Dewey; Hanry Yu
Journal:  BMC Bioinformatics       Date:  2012-03-21       Impact factor: 3.169

4.  Clustering metagenomic sequences with interpolated Markov models.

Authors:  David R Kelley; Steven L Salzberg
Journal:  BMC Bioinformatics       Date:  2010-11-02       Impact factor: 3.169

5.  DNACLUST: accurate and efficient clustering of phylogenetic marker genes.

Authors:  Mohammadreza Ghodsi; Bo Liu; Mihai Pop
Journal:  BMC Bioinformatics       Date:  2011-06-30       Impact factor: 3.169

6.  Similarity maps and hierarchical clustering for annotating FT-IR spectral images.

Authors:  Qiaoyong Zhong; Chen Yang; Frederik Großerüschkamp; Angela Kallenbach-Thieltges; Peter Serocka; Klaus Gerwert; Axel Mosig
Journal:  BMC Bioinformatics       Date:  2013-11-20       Impact factor: 3.169

7.  Semi-supervised adaptive-height snipping of the hierarchical clustering tree.

Authors:  Askar Obulkasim; Gerrit A Meijer; Mark A van de Wiel
Journal:  BMC Bioinformatics       Date:  2015-01-16       Impact factor: 3.169

Review 8.  HCsnip: An R Package for Semi-supervised Snipping of the Hierarchical Clustering Tree.

Authors:  Askar Obulkasim; Mark A van de Wiel
Journal:  Cancer Inform       Date:  2015-03-22

9.  Prediction of disease-related genes based on weighted tissue-specific networks by using DNA methylation.

Authors:  Min Li; Jiayi Zhang; Qing Liu; Jianxin Wang; Fang-Xiang Wu
Journal:  BMC Med Genomics       Date:  2014-10-22       Impact factor: 3.063

Review 10.  Phylogenetics and the human microbiome.

Authors:  Frederick A Matsen
Journal:  Syst Biol       Date:  2014-08-07       Impact factor: 15.683
