
Enabling semantic queries across federated bioinformatics databases.

Ana Claudia Sima, Tarcisio Mendes de Farias, Erich Zbinden, Maria Anisimova, Manuel Gil, Heinz Stockinger, Kurt Stockinger, Marc Robinson-Rechavi, Christophe Dessimoz.

Abstract

MOTIVATION: Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data available publicly. However, the heterogeneity of the different data sources, both at the syntactic and the semantic level, still poses significant challenges for achieving interoperability among biological databases.
RESULTS: We introduce an ontology-based federated approach for data integration. We applied this approach to three heterogeneous data stores that span different areas of biological knowledge: (i) Bgee, a gene expression relational database; (ii) Orthologous Matrix (OMA), a Hierarchical Data Format 5 orthology data store; and (iii) UniProtKB, a Resource Description Framework (RDF) store containing protein sequence and functional information. To enable federated queries across these sources, we first defined a new semantic model for gene expression called GenEx. We then show how the relational data in Bgee can be expressed as a virtual RDF graph, instantiating GenEx, through dedicated relational-to-RDF mappings. By applying these mappings, Bgee data are now accessible through a public SPARQL endpoint. Similarly, the materialized RDF data of OMA, expressed in terms of the Orthology ontology, are made available through a public SPARQL endpoint. We identified and formally described intersection points (i.e. virtual links) among the three data sources. These allow performing joint queries across the data stores. Finally, we lay the groundwork to enable nontechnical users to benefit from the integrated data, by providing a natural language template-based search interface.

© The authors 2019. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications.
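As a rough illustration of the joint queries described in the abstract, the sketch below composes a federated SPARQL query in which shared variables play the role of the intersection points (virtual links) between sources. It is not the authors' implementation. The endpoint URLs are assumptions to be checked against each resource's documentation, and the `ex:` prefix and its predicates are hypothetical placeholders rather than terms from GenEx or the Orthology ontology. The code only builds the query text and sends nothing over the network.

```python
from typing import Dict

# Assumed endpoint URLs (not taken from the abstract); verify before use.
ENDPOINTS: Dict[str, str] = {
    "oma": "https://sparql.omabrowser.org/sparql",
    "uniprot": "https://sparql.uniprot.org/sparql",
}


def service_block(endpoint_url: str, patterns: str) -> str:
    """Wrap triple patterns in a SPARQL 1.1 SERVICE clause for a remote endpoint."""
    indented = "\n".join(
        "    " + line.strip() for line in patterns.strip().splitlines()
    )
    return f"  SERVICE <{endpoint_url}> {{\n{indented}\n  }}"


def build_federated_query(gene_name: str) -> str:
    """Build a query meant to be submitted to a gene expression endpoint.

    The patterns outside any SERVICE clause are evaluated locally; the two
    SERVICE clauses are delegated to the remote stores. The variables ?gene
    and ?ortholog are shared across blocks and act as the join points.
    All ex: predicates are hypothetical placeholders.
    """
    # Escape backslashes and double quotes so the name is a valid literal.
    safe = gene_name.replace("\\", "\\\\").replace('"', '\\"')
    local = (
        f'  ?gene ex:label "{safe}" .\n'
        "  ?gene ex:isExpressedIn ?anatomicalEntity ."
    )
    oma = service_block(ENDPOINTS["oma"], "?gene ex:hasOrtholog ?ortholog .")
    uniprot = service_block(
        ENDPOINTS["uniprot"], "?protein ex:encodedBy ?ortholog ."
    )
    return "\n".join(
        [
            "PREFIX ex: <http://example.org/vocab#>",
            "SELECT ?gene ?anatomicalEntity ?ortholog ?protein WHERE {",
            local,
            oma,
            uniprot,
            "}",
        ]
    )


if __name__ == "__main__":
    print(build_federated_query("INS"))
```

Keeping each source's patterns in its own SERVICE clause makes the join variables easy to see; a real query would replace the placeholder predicates with the vocabulary each endpoint actually publishes.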

Year:  2019        PMID: 31697362      PMCID: PMC6836710          DOI: 10.1093/database/baz106

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451


  7 in total

1.  The Bgee suite: integrated curated expression atlas and comparative transcriptomics in animals.

Authors:  Frederic B Bastian; Julien Roux; Anne Niknejad; Aurélie Comte; Sara S Fonseca Costa; Tarcisio Mendes de Farias; Sébastien Moretti; Gilles Parmentier; Valentine Rech de Laval; Marta Rosikiewicz; Julien Wollbrett; Amina Echchiki; Angélique Escoriza; Walid H Gharib; Mar Gonzales-Porta; Yohan Jarosz; Balazs Laurenczy; Philippe Moret; Emilie Person; Patrick Roelli; Komal Sanjeev; Mathieu Seppey; Marc Robinson-Rechavi
Journal:  Nucleic Acids Res       Date:  2021-01-08       Impact factor: 16.971

2.  Big-Data Glycomics: Tools to Connect Glycan Biosynthesis to Extracellular Communication. (Review)

Authors:  Benjamin P Kellman; Nathan E Lewis
Journal:  Trends Biochem Sci       Date:  2020-12-18       Impact factor: 13.807

3.  A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL.

Authors:  Ana Claudia Sima; Christophe Dessimoz; Kurt Stockinger; Monique Zahn-Zabal; Tarcisio Mendes de Farias
Journal:  F1000Res       Date:  2019-10-29

4.  Querying knowledge graphs in natural language.

Authors:  Shiqi Liang; Kurt Stockinger; Tarcisio Mendes de Farias; Maria Anisimova; Manuel Gil
Journal:  J Big Data       Date:  2021-01-06

5.  Visualization Environment for Federated Knowledge Graphs: Development of an Interactive Biomedical Query Language and Web Application Interface.

Authors:  Steven Cox; Stanley C Ahalt; James Balhoff; Chris Bizon; Karamarie Fecho; Yaphet Kebede; Kenneth Morton; Alexander Tropsha; Patrick Wang; Hao Xu
Journal:  JMIR Med Inform       Date:  2020-11-23

6.  Ten Years of Collaborative Progress in the Quest for Orthologs.

Authors:  Benjamin Linard; Ingo Ebersberger; Shawn E McGlynn; Natasha Glover; Tomohiro Mochizuki; Mateus Patricio; Odile Lecompte; Yannis Nevers; Paul D Thomas; Toni Gabaldón; Erik Sonnhammer; Christophe Dessimoz; Ikuo Uchiyama
Journal:  Mol Biol Evol       Date:  2021-07-29       Impact factor: 16.240

7.  Bio-SODA UX: enabling natural language question answering over knowledge graphs with user disambiguation.

Authors:  Ana Claudia Sima; Tarcisio Mendes de Farias; Maria Anisimova; Christophe Dessimoz; Marc Robinson-Rechavi; Erich Zbinden; Kurt Stockinger
Journal:  Distrib Parallel Databases       Date:  2022-07-16       Impact factor: 0.974
