Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications.

Haiyuan Yu, Ronald Jansen, Gustavo Stolovitzky, Mark Gerstein.

Abstract

MOTIVATION: Many classifications of protein function, such as Gene Ontology (GO), are organized in directed acyclic graph (DAG) structures. In these classifications, the proteins are terminal leaf nodes; the categories 'above' them are functional annotations at various levels of specialization. Computing a numerical measure of relatedness between two arbitrary proteins is an important proteomics problem. Moreover, analogous problems are important in other contexts in large-scale information organization, e.g. the Wikipedia online encyclopedia and the Yahoo and DMOZ web page classification schemes.
RESULTS: Here we develop a simple probabilistic approach for computing this relatedness quantity, which we call the total ancestry method. Our measure is based on counting the number of leaf nodes that share exactly the same set of 'higher up' category nodes in comparison to the total number of classified pairs (i.e. the chance for the same total ancestry). We show such a measure is associated with a power-law distribution, allowing for the quick assessment of the statistical significance of shared functional annotations. We formally compare it with other quantitative functional similarity measures (such as shortest path within a DAG, lowest common ancestor shared and Azuaje's information-theoretic similarity) and provide concrete metrics to assess differences. Finally, we provide a practical implementation for our total ancestry measure for GO and the MIPS functional catalog and give two applications of it in specific functional genomics contexts.
AVAILABILITY: The implementations and results are available through our supplementary website at: http://gersteinlab.org/proj/funcsim.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
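
The counting idea in RESULTS can be illustrated with a small sketch. This is not the authors' implementation (that is on the supplementary website); it is one reading of the abstract, assuming that the "total ancestry" of a pair is the set of category nodes both leaves sit under, and that the score is the fraction of all leaf pairs sharing exactly that set. The function names, the `parents` mapping and the toy hierarchy are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ancestors(node, parents):
    """Return every category reachable from `node` by following parent links."""
    seen = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return frozenset(seen)


def total_ancestry_scores(leaves, parents):
    """Map each leaf pair to the fraction of pairs with the same shared ancestry.

    Assumption (not from the source): a pair's total ancestry is the
    intersection of the two leaves' ancestor sets.
    """
    anc = {leaf: ancestors(leaf, parents) for leaf in leaves}
    shared = {
        frozenset((a, b)): anc[a] & anc[b] for a, b in combinations(leaves, 2)
    }
    counts = Counter(shared.values())  # pairs per distinct shared-ancestor set
    n_pairs = len(shared)
    return {pair: counts[s] / n_pairs for pair, s in shared.items()}


# Hypothetical toy hierarchy: child -> list of parent categories.
toy_parents = {
    "metabolism": ["root"],
    "signaling": ["root"],
    "glycolysis": ["metabolism"],
    "p1": ["glycolysis"],
    "p2": ["glycolysis"],
    "p3": ["metabolism"],
    "p4": ["signaling"],
}
toy_scores = total_ancestry_scores(["p1", "p2", "p3", "p4"], toy_parents)
```

Under this reading, a smaller score means the shared ancestry is rarer among all pairs. In the toy hierarchy, p1 and p2 (both under glycolysis) score lower than p1 and p4 (which share only the root).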

Year:  2007        PMID: 17540677     DOI: 10.1093/bioinformatics/btm291

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  24 in total

1.  Exact score distribution computation for ontological similarity searches.

Authors:  Marcel H Schulz; Sebastian Köhler; Sebastian Bauer; Peter N Robinson
Journal:  BMC Bioinformatics       Date:  2011-11-12       Impact factor: 3.169

2.  Genome-scale analysis of interaction dynamics reveals organization of biological networks.

Authors:  Jishnu Das; Jaaved Mohammed; Haiyuan Yu
Journal:  Bioinformatics       Date:  2012-05-09       Impact factor: 6.937

3.  Dissecting disease inheritance modes in a three-dimensional protein network challenges the "guilt-by-association" principle.

Authors:  Yu Guo; Xiaomu Wei; Jishnu Das; Andrew Grimson; Steven M Lipkin; Andrew G Clark; Haiyuan Yu
Journal:  Am J Hum Genet       Date:  2013-06-20       Impact factor: 11.025

4.  Getting started in gene orthology and functional analysis.

Authors:  Gang Fang; Nitin Bhardwaj; Rebecca Robilotto; Mark B Gerstein
Journal:  PLoS Comput Biol       Date:  2010-03-26       Impact factor: 4.475

5.  Assessing the functional coherence of gene sets with metrics based on the Gene Ontology graph.

Authors:  Adam J Richards; Brian Muller; Matthew Shotwell; L Ashley Cowart; Bärbel Rohrer; Xinghua Lu
Journal:  Bioinformatics       Date:  2010-06-15       Impact factor: 6.937

6.  microRNA-122 as a regulator of mitochondrial metabolic gene network in hepatocellular carcinoma.

Authors:  Julja Burchard; Chunsheng Zhang; Angela M Liu; Ronnie T P Poon; Nikki P Y Lee; Kwong-Fai Wong; Pak C Sham; Brian Y Lam; Mark D Ferguson; George Tokiwa; Ryan Smith; Brendan Leeson; Rebecca Beard; John R Lamb; Lee Lim; Mao Mao; Hongyue Dai; John M Luk
Journal:  Mol Syst Biol       Date:  2010-08-24       Impact factor: 11.429

7.  High-quality binary protein interaction map of the yeast interactome network.

Authors:  Haiyuan Yu; Pascal Braun; Muhammed A Yildirim; Irma Lemmens; Kavitha Venkatesan; Julie Sahalie; Tomoko Hirozane-Kishikawa; Fana Gebreab; Na Li; Nicolas Simonis; Tong Hao; Jean-François Rual; Amélie Dricot; Alexei Vazquez; Ryan R Murray; Christophe Simon; Leah Tardivo; Stanley Tam; Nenad Svrzikapa; Changyu Fan; Anne-Sophie de Smet; Adriana Motyl; Michael E Hudson; Juyong Park; Xiaofeng Xin; Michael E Cusick; Troy Moore; Charlie Boone; Michael Snyder; Frederick P Roth; Albert-László Barabási; Jan Tavernier; David E Hill; Marc Vidal
Journal:  Science       Date:  2008-08-21       Impact factor: 47.728

8.  Clinical diagnostics in human genetics with semantic similarity searches in ontologies.

Authors:  Sebastian Köhler; Marcel H Schulz; Peter Krawitz; Sebastian Bauer; Sandra Dölken; Claus E Ott; Christine Mundlos; Denise Horn; Stefan Mundlos; Peter N Robinson
Journal:  Am J Hum Genet       Date:  2009-10       Impact factor: 11.025

9.  Finding local communities in protein networks.

Authors:  Konstantin Voevodski; Shang-Hua Teng; Yu Xia
Journal:  BMC Bioinformatics       Date:  2009-09-18       Impact factor: 3.169

10.  Spectral affinity in protein networks.

Authors:  Konstantin Voevodski; Shang-Hua Teng; Yu Xia
Journal:  BMC Syst Biol       Date:  2009-11-29