Improving GO semantic similarity measures by exploring the ontology beneath the terms and modelling uncertainty.

Haixuan Yang, Tamás Nepusz, Alberto Paccanaro.

Abstract

MOTIVATION: Several measures have recently been proposed for quantifying the functional similarity between gene products according to well-structured controlled vocabularies in which biological terms are organized in a tree or a directed acyclic graph (DAG). However, existing semantic similarity measures ignore two important facts. First, when calculating the similarity between two terms, they disregard the descendants of these terms. While this makes no difference when the ontology is a tree, we show that it has important consequences when the ontology is a DAG, as is the case, for example, with the Gene Ontology (GO). Second, existing similarity measures do not model the inherent uncertainty that comes from the fact that our current knowledge of gene annotation and of the ontology structure is incomplete. Here, we propose a novel approach based on downward random walks that can be used to improve any of the existing similarity measures so that it exhibits these two properties. The approach is computationally efficient: random walks do not need to be simulated, as we provide formulas to calculate their stationary distributions.
RESULTS: To show that our approach can potentially improve any semantic similarity measure, we test it on six different semantic similarity measures: three commonly used measures by Resnik (1999), Lin (1998), and Jiang and Conrath (1997); and three recently proposed measures: simUI, simGIC by Pesquita et al. (2008); GraSM by Couto et al. (2007); and Couto and Silva (2011). We applied these improved measures to the GO annotations of the yeast Saccharomyces cerevisiae, and tested how they correlate with sequence similarity, mRNA co-expression and protein-protein interaction data. Our results consistently show that the use of downward random walks leads to more reliable similarity measures.
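
For readers unfamiliar with the three classical baselines named above, the following is a minimal sketch of the standard information-content definitions of the Resnik, Lin, and Jiang-Conrath measures on a small DAG. The toy ontology, the annotation counts, and all function names are hypothetical and chosen only for illustration. The sketch does not implement the downward random walk approach of the paper, whose formulas are not given in the abstract.

```python
import math

# Hypothetical toy ontology: term -> set of parent terms.
# "C" has two parents, which is what makes this a DAG rather than a tree.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A", "B"},
    "D": {"A"},
}

# Hypothetical number of gene products annotated directly to each term.
DIRECT = {"root": 0, "A": 1, "B": 2, "C": 3, "D": 2}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(total / count(t)), where count(t) includes annotations
    propagated upward from all descendants of t."""
    counts = {t: 0 for t in PARENTS}
    for term, n in DIRECT.items():
        for anc in ancestors(term):
            counts[anc] += n
    total = counts["root"]
    return {t: math.log(total / c) for t, c in counts.items()}


IC = information_content()


def resnik(a, b):
    """IC of the most informative common ancestor of a and b."""
    return max(IC[t] for t in ancestors(a) & ancestors(b))


def lin(a, b):
    """Resnik similarity normalised by the IC of the two terms."""
    return 2 * resnik(a, b) / (IC[a] + IC[b])


def jiang_conrath_distance(a, b):
    """Jiang-Conrath is a distance: smaller means more similar."""
    return IC[a] + IC[b] - 2 * resnik(a, b)


if __name__ == "__main__":
    for pair in [("C", "D"), ("A", "B"), ("C", "C")]:
        print(pair, resnik(*pair), lin(*pair), jiang_conrath_distance(*pair))
```

Note that all three functions look only at the ancestors of the two terms. This is the limitation that the MOTIVATION paragraph points out: descendants of the compared terms play no role in the score.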

Year:  2012        PMID: 22522134     DOI: 10.1093/bioinformatics/bts129

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  28 in total

1.  Corpus domain effects on distributional semantic modeling of medical terms.

Authors:  Serguei V S Pakhomov; Greg Finley; Reed McEwan; Yan Wang; Genevieve B Melton
Journal:  Bioinformatics       Date:  2016-08-16       Impact factor: 6.937

2.  A census of human soluble protein complexes.

Authors:  Pierre C Havugimana; G Traver Hart; Tamás Nepusz; Haixuan Yang; Andrei L Turinsky; Zhihua Li; Peggy I Wang; Daniel R Boutz; Vincent Fong; Sadhna Phanse; Mohan Babu; Stephanie A Craig; Pingzhao Hu; Cuihong Wan; James Vlasblom; Vaqaar-un-Nisa Dar; Alexandr Bezginov; Gregory W Clark; Gabriel C Wu; Shoshana J Wodak; Elisabeth R M Tillier; Alberto Paccanaro; Edward M Marcotte; Andrew Emili
Journal:  Cell       Date:  2012-08-31       Impact factor: 41.582

3.  Spotlite: web application and augmented algorithms for predicting co-complexed proteins from affinity purification--mass spectrometry data.

Authors:  Dennis Goldfarb; Bridgid E Hast; Wei Wang; Michael B Major
Journal:  J Proteome Res       Date:  2014-10-20       Impact factor: 4.466

4.  Improving the measurement of semantic similarity between gene ontology terms and gene products: insights from an edge- and IC-based hybrid method.

Authors:  Xiaomei Wu; Erli Pang; Kui Lin; Zhen-Ming Pei
Journal:  PLoS One       Date:  2013-05-31       Impact factor: 3.240

5.  An integrative approach for measuring semantic similarities using gene ontology.

Authors:  Jiajie Peng; Hongxiang Li; Qinghua Jiang; Yadong Wang; Jin Chen
Journal:  BMC Syst Biol       Date:  2014-12-12

6.  Predicting protein function via downward random walks on a gene ontology.

Authors:  Guoxian Yu; Hailong Zhu; Carlotta Domeniconi; Jiming Liu
Journal:  BMC Bioinformatics       Date:  2015-08-27       Impact factor: 3.169

7.  Mining protein interactomes to improve their reliability and support the advancement of network medicine. (Review)

Authors:  Gregorio Alanis-Lobato
Journal:  Front Genet       Date:  2015-09-23       Impact factor: 4.599

8.  Measuring the evolution of ontology complexity: the gene ontology case study.

Authors:  Olivier Dameron; Charles Bettembourg; Nolwenn Le Meur
Journal:  PLoS One       Date:  2013-10-11       Impact factor: 3.240

9.  Searching for synergies: matrix algebraic approaches for efficient pair screening.

Authors:  Philip Gerlee; Linnéa Schmidt; Naser Monsefi; Teresia Kling; Rebecka Jörnsten; Sven Nelander
Journal:  PLoS One       Date:  2013-07-25       Impact factor: 3.240

10.  An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

Authors:  Giorgio Valentini; Alberto Paccanaro; Horacio Caniza; Alfonso E Romero; Matteo Re
Journal:  Artif Intell Med       Date:  2014-03-20       Impact factor: 5.326
