GO2PUB: Querying PubMed with semantic expansion of gene ontology terms.

Charles Bettembourg, Christian Diot, Anita Burgun, Olivier Dameron.

Abstract

BACKGROUND: With the development of high-throughput methods of gene analysis, there is a growing need for mining tools that retrieve relevant articles from PubMed. As PubMed grows, literature searches become more complex and time-consuming, so automated search tools with good precision and recall are necessary. We developed GO2PUB to automatically enrich PubMed queries with the names, symbols and synonyms of genes annotated by a GO term of interest or one of its descendants.
RESULTS: GO2PUB enriches PubMed queries based on selected GO terms and keywords. It processes the results and displays the PMID, title, authors, abstract and bibliographic references of the articles. Gene names, symbols and synonyms that have been generated as extra keywords from the GO terms are also highlighted. GO2PUB is based on a semantic expansion of PubMed queries using the semantic inheritance between terms through the GO graph. Two experts manually assessed the relevance of the results of GO2PUB, GoPubMed and PubMed on three queries about lipid metabolism. The experts' agreement was high (kappa = 0.88). GO2PUB returned 69% of the relevant articles, GoPubMed 40% and PubMed 29%. GO2PUB and GoPubMed have 17% of their results in common, corresponding to 24% of the total number of relevant results. 70% of the articles returned by more than one tool were relevant. 36% of the relevant articles were returned only by GO2PUB, 17% only by GoPubMed and 14% only by PubMed. To determine whether these results can be generalized, we generated twenty queries based on random GO terms with a granularity similar to that of the first three queries and compared the proportions of GO2PUB and GoPubMed results. These proportions were 77% and 40%, respectively, for the first queries, and 70% and 38% for the random queries. The two experts also assessed the relevance of seven of the twenty queries (the three related to lipid metabolism and four related to other domains). Expert agreement was high (0.93 and 0.8). The performance of GO2PUB and GoPubMed was similar to that on the first queries.
CONCLUSIONS: We demonstrated that using genes annotated by either the GO terms of interest or a descendant of these GO terms yields some relevant articles ignored by other tools. The comparison of GO2PUB, based on semantic expansion, with GoPubMed, based on text mining techniques, showed that the two tools are complementary. The analysis of the randomly generated queries suggests that the results obtained for lipid metabolism can be generalized to other biological processes. GO2PUB is available at http://go2pub.genouest.org.
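
The semantic expansion described in the abstract can be illustrated with a minimal Python sketch. It is not GO2PUB's actual implementation: the term identifiers, gene labels, keywords, helper names and the simplified boolean query syntax are all hypothetical, chosen only to show the idea of collecting the gene names annotated by a GO term or any of its descendants and adding them to a keyword query.

```python
from collections import deque

# Toy hierarchy: parent term -> direct child terms (hypothetical identifiers,
# not real GO accessions).
GO_CHILDREN = {
    "GO:lipid_metabolism": ["GO:fatty_acid_metabolism", "GO:steroid_metabolism"],
    "GO:fatty_acid_metabolism": ["GO:fatty_acid_oxidation"],
    "GO:steroid_metabolism": [],
    "GO:fatty_acid_oxidation": [],
}

# Toy annotations: term -> gene names, symbols and synonyms (hypothetical).
GO_ANNOTATIONS = {
    "GO:lipid_metabolism": ["GENE_A"],
    "GO:fatty_acid_oxidation": ["GENE_B", "GENE_B synonym"],
    "GO:steroid_metabolism": ["GENE_C"],
}


def term_and_descendants(term):
    """Return the term plus every term reachable through child links."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in GO_CHILDREN.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def expand_query(term, keywords):
    """Combine all keywords (AND) with any gene label of the expanded terms (OR)."""
    genes = sorted(
        {g for t in term_and_descendants(term) for g in GO_ANNOTATIONS.get(t, [])}
    )
    gene_clause = " OR ".join(f'"{g}"' for g in genes)
    keyword_clause = " AND ".join(f'"{k}"' for k in keywords)
    return f"({keyword_clause}) AND ({gene_clause})"


print(expand_query("GO:lipid_metabolism", ["liver"]))
```

In this sketch, a query on the parent term also picks up genes annotated only to its descendants, which is the property the abstract credits for retrieving articles that other tools miss. A real tool would also need to handle terms without annotated genes and the query-length limits of the search service.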

Year:  2012        PMID: 22958570      PMCID: PMC3599846          DOI: 10.1186/2041-1480-3-7

Source DB:  PubMed          Journal:  J Biomed Semantics

