
Gauging triple stores with actual biological data.

Vladimir Mironov, Nirmala Seethappan, Ward Blondé, Erick Antezana, Andrea Splendiani, Martin Kuiper.   

Abstract

BACKGROUND: Semantic Web technologies have been developed to overcome the limitations of the current Web and conventional data integration solutions. The Semantic Web is expected to link all the data present on the Internet instead of linking just documents. One of the foundations of the Semantic Web technologies is the knowledge representation language Resource Description Framework (RDF). Knowledge expressed in RDF is typically stored in so-called triple stores (also known as RDF stores), from which it can be retrieved with SPARQL, a language designed for querying RDF-based models. The Semantic Web technologies should allow federated queries over multiple triple stores. In this paper we compare the efficiency of a set of biologically relevant queries as applied to a number of different triple store implementations.
RESULTS: Previously we developed a library of queries to guide the use of our knowledge base, the Cell Cycle Ontology, which is implemented as a triple store. We have now compared the performance of these queries on five non-commercial triple stores: OpenLink Virtuoso (Open-Source Edition), Jena SDB, Jena TDB, SwiftOWLIM and 4Store. We examined three performance aspects: data uploading time, query execution time and scalability. The queries we had chosen addressed diverse ontological or biological questions, and we found that individual store performance was quite query-specific. We identified three groups of queries displaying similar behaviour across the different stores: 1) queries with relatively short response times, 2) queries with moderate response times and 3) queries with relatively long response times. SwiftOWLIM performed best in the first group, 4Store in the second and Virtuoso in the third.
CONCLUSIONS: Our analysis showed that some queries behaved idiosyncratically, in a triple-store-specific manner, mainly with SwiftOWLIM and 4Store. Virtuoso, as expected, displayed a very balanced performance: its load time and its response times for all the tested queries were better than average among the selected stores, and it showed very good scalability and reasonable run-to-run reproducibility. Jena SDB and Jena TDB were consistently slower than the other three implementations. Our analysis demonstrated that most queries developed for Virtuoso could be successfully used with the other implementations.
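
The kind of measurement described in the abstract (query execution time and run-to-run reproducibility) can be illustrated with a minimal sketch. It is not the authors' benchmark code. The toy triples, the `ex:` identifiers, and the helper names `match` and `time_query` are assumptions made for illustration, and the in-memory list only stands in for a real triple store queried with SPARQL.

```python
import statistics
import time

# Toy data: (subject, predicate, object) triples. These identifiers are
# invented for illustration and are not taken from the Cell Cycle Ontology.
TRIPLES = [
    ("ex:proteinA", "ex:participates_in", "ex:cell_cycle"),
    ("ex:proteinB", "ex:participates_in", "ex:cell_cycle"),
    ("ex:proteinA", "ex:located_in", "ex:nucleus"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern.

    None acts as a wildcard, loosely like a variable in a SPARQL
    triple pattern.
    """
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def time_query(run_query, repeats=5):
    """Run a query several times and summarise the timings.

    Returns the median duration in seconds and the relative spread
    (max minus min, divided by the median) as a crude indicator of
    run-to-run reproducibility.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    median = statistics.median(durations)
    spread = (max(durations) - min(durations)) / median if median > 0 else 0.0
    return {"median_s": median, "relative_spread": spread}


if __name__ == "__main__":
    query = lambda: match(TRIPLES, p="ex:participates_in", o="ex:cell_cycle")
    print(query())
    print(time_query(query))
```

To compare real stores, `run_query` would be replaced by a function that sends the same SPARQL query to each store's endpoint; the timing logic would stay the same.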

Year:  2012        PMID: 22373359      PMCID: PMC3471352          DOI: 10.1186/1471-2105-13-S1-S3

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


