Exact score distribution computation for ontological similarity searches.

Marcel H Schulz, Sebastian Köhler, Sebastian Bauer, Peter N Robinson.

Abstract

BACKGROUND: Semantic similarity searches in ontologies are an important component of many bioinformatic algorithms, e.g., finding functionally related proteins with the Gene Ontology or phenotypically similar diseases with the Human Phenotype Ontology (HPO). We have recently shown that the performance of semantic similarity searches can be improved by ranking results according to the probability of obtaining a given score at random rather than by the scores themselves. However, to date, there are no algorithms for computing the exact distribution of semantic similarity scores, which is necessary for computing the exact P-value of a given score.
RESULTS: In this paper we consider the exact computation of score distributions for similarity searches in ontologies, and introduce a simple null hypothesis which can be used to compute a P-value for the statistical significance of similarity scores. We concentrate on measures based on Resnik's definition of ontological similarity. A new algorithm is proposed that collapses subgraphs of the ontology graph and thereby allows fast score distribution computation. The new algorithm is several orders of magnitude faster than the naive approach, as we demonstrate by computing score distributions for similarity searches in the HPO. It is shown that exact P-value calculation improves clinical diagnosis using the HPO compared to approaches based on sampling.
CONCLUSIONS: The new algorithm enables, for the first time, exact P-value calculation via exact score distribution computation for ontology similarity searches. The approach is applicable to any ontology for which the annotation-propagation rule holds and can improve any bioinformatic method that currently uses only the raw similarity scores. The algorithm was implemented in Java, supports any ontology in OBO format, and is available for non-commercial and academic use at: https://compbio.charite.de/svn/hpo/trunk/src/tools/significance/
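
To make the idea of an exact score distribution concrete, the following is a minimal Python sketch of the naive approach the abstract compares against: it enumerates every possible query of a fixed size and tallies the resulting scores. It is not the authors' subgraph-collapsing algorithm. The toy ontology, the annotations, and all function names are hypothetical. The scoring rule (mean of the best Resnik match per query term) and the null model (queries drawn uniformly at random, without replacement, from all terms) are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
import math

# Hypothetical toy ontology: term -> list of parents (a DAG rooted at "root").
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "C": ["A"], "D": ["A", "B"]}

# Hypothetical direct annotations: item -> set of terms.
ANNOTATIONS = {"item1": {"C"}, "item2": {"D"}, "item3": {"B"}, "item4": {"C", "D"}}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(N / n_t), where n_t counts items annotated to t after
    propagating every annotation to all ancestors (annotation-propagation rule)."""
    counts = Counter()
    for terms in ANNOTATIONS.values():
        counts.update(set().union(*(ancestors(t) for t in terms)))
    n_items = len(ANNOTATIONS)
    return {t: math.log(n_items / counts[t]) for t in PARENTS if counts[t] > 0}


IC = information_content()


def resnik(t1, t2):
    """Resnik similarity: IC of the most informative common ancestor."""
    return max(IC.get(t, 0.0) for t in ancestors(t1) & ancestors(t2))


def score(query, targets):
    """Assumed scoring rule: mean over query terms of the best match in targets."""
    return sum(max(resnik(q, t) for t in targets) for q in query) / len(query)


def exact_distribution(targets, query_size):
    """Enumerate all queries of the given size and return {score: probability}."""
    tally = Counter(
        round(score(q, targets), 9) for q in combinations(sorted(PARENTS), query_size)
    )
    total = sum(tally.values())
    return {s: Fraction(c, total) for s, c in tally.items()}


def p_value(observed, distribution):
    """Exact P-value: probability of a score at least as large as the observed one."""
    return sum(p for s, p in distribution.items() if s >= round(observed, 9))


dist = exact_distribution({"C"}, query_size=2)
p = p_value(score(("A", "C"), {"C"}), dist)
print(p)  # 1/5 for this toy data
```

Full enumeration visits every combination of terms, so its cost grows combinatorially with ontology and query size. This is why a faster exact method matters for an ontology the size of the HPO.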

Year: 2011; PMID: 22078312; PMCID: PMC3240574; DOI: 10.1186/1471-2105-12-441

Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169
