
Brain: biomedical knowledge manipulation.

Samuel Croset, John P Overington, Dietrich Rebholz-Schuhmann.

Abstract

SUMMARY: Brain is a Java software library facilitating the manipulation and creation of ontologies and knowledge bases represented with the Web Ontology Language (OWL).
AVAILABILITY AND IMPLEMENTATION: The Java source code and the library are freely available at https://github.com/loopasam/Brain and on the Maven Central repository (GroupId: uk.ac.ebi.brain). The documentation is available at https://github.com/loopasam/Brain/wiki.

Year:  2013        PMID: 23505292      PMCID: PMC3634181          DOI: 10.1093/bioinformatics/btt109

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 MOTIVATION

Knowledge bases, a concept from computer science (see Krötzsch for an introduction), could help improve the interoperability and the value of the large amount of biomedical information available online. At the time of writing, only a few options are available to handle such knowledge bases: complex libraries, such as the Web Ontology Language (OWL)-API (Horridge and Bechhofer, 2011), or didactic graphical user interfaces, such as Protege (http://protege.stanford.edu/) or TopBraid Composer (http://www.topbraidcomposer.com). An intermediary framework, OWLTools (http://code.google.com/p/owltools/), provides some methods to query biomedical ontologies, but interaction with the library is mostly done via the command line, which limits the scale of projects that can be built with it. Brain, the Java software library presented in this manuscript, addresses these issues and provides a comprehensive and simplified interface dedicated to the programmatic creation and querying of biomedical knowledge bases. The library aims to bridge the gap between graphical user interfaces and the OWL-API and is particularly useful for developing web applications. Brain focuses on the EL profile of OWL, as it covers the majority of biomedical use-cases and offers good reasoning performance and scalability.

2 SCALABLE KNOWLEDGE BASES

OWL derives from description logics and has been designed to capture the knowledge of a domain of interest in the form of a structured vocabulary (http://www.w3.org/TR/owl2-overview/). This feature makes it particularly interesting for the life sciences, as a number of ontologies and classification schemes have been developed since the origin of the discipline and can now be converted into an OWL representation. Brain focuses on a particular profile of OWL, called EL, which consists of a subset of the constructs available in the original language (Motik). This profile is designed to be tractable, meaning that reasoning over the available constructs has polynomial-time complexity and is, therefore, cheaper to compute than reasoning over the full version of OWL. Brain primarily supports the OWL 2 EL profile for its computational properties and its suitability for real-life biomedical applications, where millions of axioms could potentially be extracted from complex repositories, such as ChEMBL (Gaulton). Moreover, the EL profile is expressive enough to cover a good portion of biomedical knowledge: most Open Biomedical Ontologies (OBO), such as the Gene Ontology (GO; Ashburner) or the Chemical Entities of Biological Interest (ChEBI; De Matos), already fall within this profile, opening the door to large-scale meaningful data integration. Brain builds on top of Elk, a fast reasoner dedicated to EL ontologies (Kazakov). Elk performs well on large datasets and can run some reasoning tasks in parallel; therefore, clusters or multicore architectures can speed up reasoning as more data are added to the knowledge base. Brain wraps and simplifies the interaction with Elk while still leaving advanced users the possibility to fine-tune the configuration. Figure 1 shows an example of a query using the Elk reasoner.

Fig. 1. Implementation example in Java of an axiom using Brain. The axiom expressed in natural language: A nucleus is part of some cells. The same axiom described in OWL using the Manchester syntax: Nucleus subClassOf part-of some Cell

3 LIBRARY FEATURES

Brain is implemented as a facade over the OWL-API and provides a series of convenience methods for common use-cases encountered in the biomedical domain. To simplify the interaction with the OWL-API, Brain follows the design principles described in the subsections below. The full list of currently supported constructs and methods is available in the Supplementary Material and in the online documentation.

3.1 Unique knowledge base

An instance of a Brain object holds a reference to only one knowledge base. It is nevertheless possible to import external ontologies, either stored locally or retrieved over a network, but Brain always merges the added information into the existing knowledge base.

3.2 Unique short form names

The names (short forms) of OWL entities handled by a Brain object have to be unique. For instance, it is not possible to add an OWL class such as http://www.example.org/Cell to the ontology if an OWL entity with the short form ‘Cell’ already exists. Although it contradicts some Semantic Web principles, this design prevents ambiguous queries and hides, as far as possible, the cumbersome handling of prefixes and Internationalized Resource Identifiers (IRI).
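
The uniqueness rule can be illustrated with a minimal, self-contained sketch. The class `ShortFormRegistry` and its methods are hypothetical names chosen for illustration and are not part of Brain's API; the sketch also assumes that a short form is the text after the last `#` or `/` of an IRI, which is a common convention rather than something stated above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry enforcing unique short forms (illustration only, not Brain's API). */
final class ShortFormRegistry {
    private final Map<String, String> iriByShortForm = new HashMap<>();

    /** Assumed convention: the short form is whatever follows the last '#' or '/'. */
    static String shortFormOf(String iri) {
        int cut = Math.max(iri.lastIndexOf('#'), iri.lastIndexOf('/'));
        return iri.substring(cut + 1);
    }

    /** Registers an entity and returns its short form; rejects an already used short form. */
    String register(String iri) {
        String shortForm = shortFormOf(iri);
        String existing = iriByShortForm.putIfAbsent(shortForm, iri);
        if (existing != null) {
            throw new IllegalArgumentException(
                "Short form '" + shortForm + "' is already used by " + existing);
        }
        return shortForm;
    }

    /** Resolves a short form back to its full IRI, or null if it is unknown. */
    String resolve(String shortForm) {
        return iriByShortForm.get(shortForm);
    }
}
```

With such a registry, a query can refer to `Cell` without a prefix, because at most one IRI can stand behind that name.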

3.3 Typeless interaction

The interaction with the library relies on the user-friendly Manchester syntax entered as strings (Horridge). This removes the need to create Java objects and is particularly suitable in a web server set-up, where requests are likely to arrive as untyped strings. Using strings as input also speeds up development and makes the resulting code more flexible, for example when moving from a relational database to an OWL representation. Figure 1 provides an example of axiom implementation using the Manchester syntax.
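
To make the idea of string-based input concrete, the following toy sketch turns the axiom from Figure 1 into a typed value. It is only an illustration under strong assumptions: the class `ExistentialAxiom` is hypothetical, it recognises the single shape `A subClassOf p some B`, and it is in no way a Manchester syntax parser or a reproduction of how Brain processes its input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value type for axioms of the shape "A subClassOf p some B" (illustration only). */
final class ExistentialAxiom {
    // Three whitespace-free names separated by the keywords "subClassOf" and "some".
    private static final Pattern SHAPE =
        Pattern.compile("(\\S+)\\s+subClassOf\\s+(\\S+)\\s+some\\s+(\\S+)");

    final String subClass;
    final String property;
    final String filler;

    private ExistentialAxiom(String subClass, String property, String filler) {
        this.subClass = subClass;
        this.property = property;
        this.filler = filler;
    }

    /** Parses the untyped request string, failing loudly on any other shape. */
    static ExistentialAxiom parse(String text) {
        Matcher m = SHAPE.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported axiom: " + text);
        }
        return new ExistentialAxiom(m.group(1), m.group(2), m.group(3));
    }
}
```

Calling `ExistentialAxiom.parse("Nucleus subClassOf part-of some Cell")` yields the parts `Nucleus`, `part-of` and `Cell`, which is the kind of translation a string-based interface has to perform on the caller's behalf.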

3.4 Error-handling

Because the interaction with Brain is built around strings rather than Java objects, special care must be taken with exception handling to keep the program executing correctly. Brain throws different types of exception tailored to the operation performed by the user. This feature is essential when developing large applications and helps to maintain the consistency of the underlying knowledge base.
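
A minimal sketch of operation-specific exceptions is given here. All type names (`KnowledgeBaseException`, `DuplicateEntityException`, `MissingEntityException`, `ClassVocabulary`) are hypothetical and chosen for illustration; they should not be read as the exception types that Brain actually defines.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical common parent for all knowledge-base errors (not Brain's API). */
class KnowledgeBaseException extends Exception {
    KnowledgeBaseException(String message) { super(message); }
}

/** Raised when an entity is added under a name that is already in use. */
class DuplicateEntityException extends KnowledgeBaseException {
    DuplicateEntityException(String name) { super("Entity already exists: " + name); }
}

/** Raised when a string refers to an entity that was never declared. */
class MissingEntityException extends KnowledgeBaseException {
    MissingEntityException(String name) { super("Unknown entity: " + name); }
}

/** Hypothetical vocabulary of class names addressed by plain strings. */
final class ClassVocabulary {
    private final Set<String> classNames = new HashSet<>();

    /** Declares a class; a second declaration of the same name is an error. */
    void addClass(String name) throws DuplicateEntityException {
        if (!classNames.add(name)) {
            throw new DuplicateEntityException(name);
        }
    }

    /** Checks that a name typed by the user refers to a declared class. */
    void requireClass(String name) throws MissingEntityException {
        if (!classNames.contains(name)) {
            throw new MissingEntityException(name);
        }
    }
}
```

Because the exceptions are checked, the compiler forces callers to decide what happens when a string does not match the state of the knowledge base, which is where string-based input is most fragile.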

3.5 Knowledge integration

An interesting feature brought by the Semantic Web and OWL is the possibility to merge information based on the IRIs of the entities described. The library supports the loading and integration of external knowledge bases, as well as references to external entities. Data from different sources can, therefore, be easily connected and reasoned over by Brain. The integration of an external knowledge base is shown in Figure 1.
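
IRI-based merging can be sketched in a few lines. The class `AxiomStore` below is hypothetical and stores only subclass statements between named classes, which is a drastic simplification of an OWL knowledge base; it is meant to show why shared IRIs make integration straightforward, not how Brain or the OWL-API implement it.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical store of "sub subClassOf super" statements keyed by IRI (illustration only). */
final class AxiomStore {
    private final Map<String, Set<String>> superClassesByIri = new TreeMap<>();

    /** Records that the class identified by subIri is a subclass of superIri. */
    void addSubClassOf(String subIri, String superIri) {
        superClassesByIri.computeIfAbsent(subIri, k -> new TreeSet<>()).add(superIri);
    }

    /** Merges another store into this one; statements about the same IRI are unioned. */
    void merge(AxiomStore other) {
        for (Map.Entry<String, Set<String>> entry : other.superClassesByIri.entrySet()) {
            superClassesByIri
                .computeIfAbsent(entry.getKey(), k -> new TreeSet<>())
                .addAll(entry.getValue());
        }
    }

    /** Returns a read-only view of the asserted superclasses of the given IRI. */
    Set<String> superClassesOf(String iri) {
        return Collections.unmodifiableSet(
            superClassesByIri.getOrDefault(iri, Collections.emptySet()));
    }
}
```

If two sources both describe `http://www.example.org/Nucleus`, merging them leaves a single entry for that IRI carrying the statements of both, with no mapping step in between.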

3.6 Querying

Brain is oriented towards efficient querying of OWL 2 EL knowledge bases. This characteristic makes it suitable as a query engine on a web server for answering live queries from users. Powerful questions can be formulated using the Manchester syntax, abstracting away complex interaction with the Java objects provided by the OWL-API (illustrated in Fig. 1). An example of question answering over the GO using Brain is compared with a traditional SQL query in the Supplementary Material.
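
The kind of question such a query engine answers can be approximated by a small sketch that returns all direct and indirect subclasses of a named class. The class `SubclassIndex` is hypothetical, handles only asserted subclass links between named classes, and uses a plain breadth-first traversal; a reasoner such as Elk derives far more (for example, subsumptions involving `some` restrictions), so this is a stand-in for illustration rather than a description of Elk or Brain.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical index answering "which classes are subclasses of X?" (illustration only). */
final class SubclassIndex {
    private final Map<String, Set<String>> directSubClasses = new HashMap<>();

    /** Records the asserted statement "sub subClassOf sup". */
    void addSubClassOf(String sub, String sup) {
        directSubClasses.computeIfAbsent(sup, k -> new TreeSet<>()).add(sub);
    }

    /** Returns every direct or indirect subclass of sup, in alphabetical order. */
    Set<String> subClassesOf(String sup) {
        Set<String> found = new TreeSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(sup);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String child : directSubClasses.getOrDefault(current, Collections.emptySet())) {
                // Visiting each class once keeps the traversal finite even with cycles.
                if (found.add(child)) {
                    queue.add(child);
                }
            }
        }
        return found;
    }
}
```

Each class and each asserted link is visited at most once, so the cost of this toy query grows linearly with the number of statements.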

4 CONCLUSION

Brain is an open source Java library designed to build and query biomedical knowledge bases or OWL ontologies. The library is centered on the EL profile and designed to be suitable and scalable for biomedical knowledge representation. The convenience methods provided by Brain should simplify the development of biomedical knowledge bases and allow developers to increase their productivity while effectively dealing with data integration challenges.

Funding: EMBL member states. S.C. is a member of Darwin College, University of Cambridge.

Conflict of Interest: none declared.

REFERENCES

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  ChEMBL: a large-scale bioactivity database for drug discovery.

Authors:  Anna Gaulton; Louisa J Bellis; A Patricia Bento; Jon Chambers; Mark Davies; Anne Hersey; Yvonne Light; Shaun McGlinchey; David Michalovich; Bissan Al-Lazikani; John P Overington
Journal:  Nucleic Acids Res       Date:  2011-09-23       Impact factor: 16.971

3.  Chemical Entities of Biological Interest: an update.

Authors:  Paula de Matos; Rafael Alcántara; Adriano Dekker; Marcus Ennis; Janna Hastings; Kenneth Haug; Inmaculada Spiteri; Steve Turner; Christoph Steinbeck
Journal:  Nucleic Acids Res       Date:  2009-10-23       Impact factor: 16.971
