
OLS Client and OLS Dialog: Open Source Tools to Annotate Public Omics Datasets.

Yasset Perez-Riverol, Tobias Ternent, Maximilian Koch, Harald Barsnes, Olga Vrousgou, Simon Jupp, Juan Antonio Vizcaíno.

Abstract

The availability of user-friendly software to annotate biological datasets and experimental details is becoming essential in data management practices, both in local storage systems and in public databases. The Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) is a popular centralized service to query, browse and navigate biomedical ontologies and controlled vocabularies. Recently, the OLS framework has been completely redeveloped (version 3.0), including enhancements in the data model, like the added support for Web Ontology Language based ontologies, among many other improvements. However, the new OLS is not backwards compatible and new software tools are needed to enable access to this widely used framework now that the previous version is no longer available. We here present the OLS Client as a free, open-source Java library to retrieve information from the new version of the OLS. It enables rapid tool creation by providing a robust, pluggable programming interface and common data model to programmatically access the OLS. The library has already been integrated and is routinely used by several bioinformatics resources and related data annotation tools. Secondly, we also introduce an updated version of the OLS Dialog (version 2.0), a Java graphical user interface that can be easily plugged into Java desktop applications to access the OLS. The software and related documentation are freely available at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog.
© 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

Keywords:  data annotation; omics datasets; ontologies; open source software

Year:  2017        PMID: 28792687      PMCID: PMC5707441          DOI: 10.1002/pmic.201700244

Source DB:  PubMed          Journal:  Proteomics        ISSN: 1615-9853            Impact factor:   3.984


Modern systems biology and bioinformatics approaches rely on the integration of large amounts of potentially disparate data, coming from multiple biological samples and generated by different techniques (e.g. omics approaches) and instrumentation [1]. The management, integration, and reuse of data require an accurate and comprehensive capture of the associated metadata, including details such as the description of the samples, the experimental design, the processing steps, and the new biological evidence and claims, among others [2]. Proper and consistent annotation of the generated data is therefore essential in order to make sense of all the information. Ontologies and controlled vocabularies (CVs) have demonstrated their usefulness in enabling the consistent annotation of large volumes of complex data in the life sciences [3]. Note that in the following text we will, for simplicity, use the term “ontology” to refer to both ontologies and CVs.
Since 2006, the Ontology Lookup Service (OLS, http://www.ebi.ac.uk/ols) at the European Bioinformatics Institute (EMBL-EBI) has provided a popular, centralized framework to query, browse and navigate biomedical ontologies in the OBO format [4], removing the need to search the websites of individual ontologies or to parse flat files available elsewhere. The ontologies are maintained by domain experts in the respective fields [3]. The new iteration of the OLS (version 3.0, originally released in May 2016) constitutes a complete redevelopment of the framework, including improvements in the data model (e.g. added support for Web Ontology Language (OWL)-based ontologies), an increase in the number of ontologies covered, and multiple enhancements in its web and programmatic interfaces and in the underlying backend. As of June 2017, the OLS integrated 191 ontologies, which correspond to around 4.9 million unique ontology terms.
The biggest change in the new OLS is the move from an eXtensible Markup Language (XML)-based Simple Object Access Protocol (SOAP) Application Programming Interface (API) to a JavaScript Object Notation (JSON)-based REpresentational State Transfer (REST) API, which requires changes to client applications written for the old system. Here we present the OLS Client, an open source Java API that provides comprehensive functionality to programmatically query, browse and retrieve all information from the OLS. It can handle all the main data types in the OLS, ranging from ontology terms and annotations to graph relationships between ontology terms, such as child and parent terms. In addition, we present an update of the OLS Dialog (version 2.0), a Java graphical user interface (GUI) built on top of the OLS Client that can be easily plugged into Java desktop applications. To the best of our knowledge, OLS Client and OLS Dialog 2.0 are the first open source Java APIs available for the new version of the OLS.
The OLS Client library (https://github.com/PRIDE-Utilities/ols-client) provides a unified access interface to ontologies and all related information in the OLS (Figure 1). The interface provides methods to access and retrieve information on each ontology term, including annotations, synonyms and all types of identifiers. The API is identifier independent: each term can be retrieved directly using any of its identifiers, including the Uniform Resource Identifier (URI) or Compact URI (CURIE), which is one of the most convenient features of the OLS Client. Following the OLS 3.0 data structure, the OLS Client data model represents a well-connected graph in which every term in each ontology contains a list of parent and child terms.
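To illustrate what identifier independence involves, the following minimal sketch converts between the CURIE form of a term identifier (e.g. MS:1001767), its underscore-separated short form, and a URI. It does not use or reproduce the OLS Client code: the class and method names are hypothetical, and the URI prefix is an assumption based on the OBO Foundry PURL convention, which not every ontology in the OLS follows.

```java
final class OboIdentifierSketch {

    // Assumption: OBO Foundry style PURL prefix; other ontologies may use a different base URI.
    static final String OBO_PURL = "http://purl.obolibrary.org/obo/";

    /** Converts a CURIE such as "MS:1001767" into the short form "MS_1001767". */
    static String toShortForm(String curie) {
        int colon = curie.indexOf(':');
        if (colon <= 0 || colon == curie.length() - 1) {
            throw new IllegalArgumentException("Not a CURIE: " + curie);
        }
        return curie.substring(0, colon) + "_" + curie.substring(colon + 1);
    }

    /** Builds a URI from a CURIE by appending the short form to the assumed prefix. */
    static String toIri(String curie) {
        return OBO_PURL + toShortForm(curie);
    }

    /** Recovers the CURIE from a URI that starts with the assumed prefix. */
    static String toCurie(String iri) {
        if (!iri.startsWith(OBO_PURL)) {
            throw new IllegalArgumentException("Unexpected URI prefix: " + iri);
        }
        String shortForm = iri.substring(OBO_PURL.length());
        int underscore = shortForm.lastIndexOf('_');
        if (underscore <= 0 || underscore == shortForm.length() - 1) {
            throw new IllegalArgumentException("Unexpected short form: " + shortForm);
        }
        return shortForm.substring(0, underscore) + ":" + shortForm.substring(underscore + 1);
    }

    public static void main(String[] args) {
        System.out.println(toShortForm("MS:1001767")); // MS_1001767
        System.out.println(toIri("MS:1001767"));
        System.out.println(toCurie(toIri("MS:1001767"))); // MS:1001767
    }
}
```

A client that accepts any of these forms can normalize them to a single key before querying, so callers do not need to know which form a given resource stores.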
The API is composed of two functional components: (i) the data models, incorporating all data structures (terms, identifiers, ontologies, annotations and synonyms); and (ii) the OLS Client query interface, providing a set of functionalities to query, browse and retrieve the information related to an ontology. The data models are used by the query interface to represent the retrieved data, as in the following example, which retrieves a term by its identifier:

    OLSClient olsClient = new OLSClient(new OLSWsConfigProd());
    Term term = olsClient.getTermById(new Identifier("MS:1001767", Identifier.IdentifierType.OBO), "MS");

Figure 1

Overview of the design of the OLS Client: (A) Graph structure of the ontology term relations for the EFO ontology and (B) Data structure used by the OLS Client library to represent and handle the OLS information.

Efficient mining of millions of ontology terms. Each ontology can potentially contain thousands of terms in well-connected graphs. This complexity poses a challenge when users want to mine one specific ontology. The OLS Client therefore implements an efficient mechanism to retrieve all terms in an ontology graph using pagination and recursive querying of the OLS web API. Every time a specific ontology term is requested, the complete list of the identifiers of all its child terms is retrieved. In addition, the API can recursively obtain all of a term's information, split into chunks. Furthermore, the OLS Client provides simple search functionality for the OLS, as shown in the following example:

    OLSClient olsClient = new OLSClient(new OLSWsConfigProd());
    List<Term> terms = olsClient.getTermsByName("modification", "ms", true);

OLS Dialog 2.0. A new version of the OLS Dialog [5] (https://github.com/PRIDE-Toolsuite/ols-dialog) has also been implemented, providing a GUI that can be plugged into any Java standalone application used for data annotation purposes. The OLS Dialog greatly simplifies the usage of the OLS Client since it does not require any additional knowledge about the OLS web services or the various ontology data formats.
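The paginated and recursive retrieval described above for the OLS Client can be pictured with the following minimal sketch. It is an illustration under stated assumptions, not the library's implementation: `PageSource`, `fetchAll`, `descendants` and `inMemory` are hypothetical names, and an in-memory list stands in for the remote OLS web API so that the example runs without network access.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PagedTermFetchSketch {

    /** Hypothetical page source: returns the term ids on a 0-based page, or an empty list past the end. */
    interface PageSource {
        List<String> fetchPage(int page, int pageSize);
    }

    /** Requests pages one after another until an empty page signals the end. */
    static List<String> fetchAll(PageSource source, int pageSize) {
        List<String> all = new ArrayList<>();
        for (int page = 0; ; page++) {
            List<String> chunk = source.fetchPage(page, pageSize);
            if (chunk.isEmpty()) {
                break;
            }
            all.addAll(chunk);
        }
        return all;
    }

    /** Collects all descendants of a term depth-first; the set skips terms reachable via several parents. */
    static Set<String> descendants(String root, Map<String, List<String>> children) {
        Set<String> seen = new LinkedHashSet<>();
        collect(root, children, seen);
        return seen;
    }

    private static void collect(String id, Map<String, List<String>> children, Set<String> seen) {
        for (String child : children.getOrDefault(id, Collections.<String>emptyList())) {
            if (seen.add(child)) {
                collect(child, children, seen);
            }
        }
    }

    /** In-memory stand-in for a remote paged service (assumption made for this example only). */
    static PageSource inMemory(List<String> ids) {
        return (page, pageSize) -> {
            int from = page * pageSize;
            if (from >= ids.size()) {
                return Collections.<String>emptyList();
            }
            return ids.subList(from, Math.min(from + pageSize, ids.size()));
        };
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("T:1", "T:2", "T:3", "T:4", "T:5");
        System.out.println(fetchAll(inMemory(ids), 2)); // all five ids, fetched in three pages

        Map<String, List<String>> children = new HashMap<>();
        children.put("T:1", Arrays.asList("T:2", "T:3"));
        children.put("T:2", Arrays.asList("T:4"));
        children.put("T:3", Arrays.asList("T:4"));
        System.out.println(descendants("T:1", children)); // T:4 is reported once despite two parents
    }
}
```

Fetching in fixed-size pages keeps each response small, while the visited set matters because ontology graphs are not strict trees: a term can have several parents.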
The OLS Dialog provides four different search strategies (Figure 2): (i) “Term Name Search”, which locates a term by a (partial) match to the searched text; (ii) “Term ID Search”, which selects a term by its accession number; (iii) “Browse Ontology”, which lets users browse the tree structure of the selected ontology to find the desired term; and (iv) “PSI-MOD Mass Search”, which uses the PSI-MOD ontology [6] and UNIMOD to select protein modifications by the delta mass corresponding to a given modification. The latter, more specific search functionality is used, for instance, in the ProteomeXchange (PX) Submission Tool [7], used by most submitters to the widely used PRIDE database (as part of PX). In all cases, when users select a specific term, all the associated details are presented in the “Term Details” table in the GUI, including its identifier, annotations and synonyms. The OLS Client is part of PRIDE-Utilities [8] and the OLS Dialog is part of the PRIDE Inspector Toolsuite [9], a set of Java components that can be reused in proteomics Java applications.
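The idea behind a delta mass search such as the “PSI-MOD Mass Search” can be sketched as a tolerance filter over a list of modifications. The code below is a hypothetical illustration and not the OLS Dialog's implementation: the class, method and field names are assumptions, and the listed masses are commonly quoted monoisotopic values included for illustration rather than retrieved from PSI-MOD or UNIMOD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeltaMassSearchSketch {

    /** Hypothetical minimal entry: a modification name and its monoisotopic delta mass in Da. */
    static final class Modification {
        final String name;
        final double deltaMass;

        Modification(String name, double deltaMass) {
            this.name = name;
            this.deltaMass = deltaMass;
        }
    }

    /** Illustrative values only; a real tool would load them from the ontology. */
    static List<Modification> sampleModifications() {
        return Arrays.asList(
                new Modification("Oxidation", 15.994915),
                new Modification("Acetyl", 42.010565),
                new Modification("Trimethyl", 42.046950),
                new Modification("Carbamidomethyl", 57.021464),
                new Modification("Phospho", 79.966331));
    }

    /** Returns every modification whose delta mass lies within the tolerance of the query mass. */
    static List<Modification> searchByMass(List<Modification> mods, double queryMass, double tolerance) {
        List<Modification> hits = new ArrayList<>();
        for (Modification m : mods) {
            if (Math.abs(m.deltaMass - queryMass) <= tolerance) {
                hits.add(m);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // A wide tolerance matches both Acetyl and Trimethyl; a narrow one separates them.
        for (Modification m : searchByMass(sampleModifications(), 42.01, 0.05)) {
            System.out.println(m.name);
        }
    }
}
```

The choice of tolerance determines how many candidate terms the user has to choose from, which is why near-isobaric modifications such as acetylation and trimethylation make a useful test case.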
Figure 2

OLS Dialog main interfaces: (A) Search functionalities include searching by the name of the ontology term, the identifier of the term, or the PTM delta mass; while (B) Browse Ontology enables browsing across all the terms in the OLS to locate the desired term.

In conclusion, it is important to note that, at the time of writing, several popular resources and tools already use the OLS Client as an annotation source, including resources such as IntAct [10], OmicsDI [11], BioModels [12], and the Reactome Pathway Annotation Tool [13], and stand-alone tools (which use the OLS Dialog on top) such as PeptideShaker [14], the Laboratory Information Management System (LIMS) colims (https://github.com/compomics/colims) and the already mentioned PX Submission Tool (for a complete list see Table 1). The widespread use of the library helps to ensure its stability, continued development, and community support. The OLS Client library and OLS Dialog (including the related documentation) are freely available and released under the Apache 2.0 license at https://github.com/PRIDE-Utilities/ols-client and https://github.com/PRIDE-Toolsuite/ols-dialog, respectively.
Table 1

Software and resources using OLS Client and/or OLS Dialog (by June 2017)

| Name | Description | URL | Tools |
| --- | --- | --- | --- |
| ProteomeXchange Submission Tool [7] | Stand-alone submission tool for the PRIDE database | https://github.com/proteomexchange/px-submission-tool | OLS Client, OLS Dialog |
| Reactome Annotation Tool [13] | Pathway annotation tool | http://www.reactome.org | OLS Client |
| IntAct [10] | Curated molecular interactions database | http://www.ebi.ac.uk/intact | OLS Client |
| PeptideShaker [14] | Search engine independent platform for the interpretation of proteomics identification results | http://compomics.github.io/projects/peptide-shaker | OLS Client |
| Omics Discovery Index [11] | A multi-omics dataset discovery resource | http://www.omicsdi.org | OLS Client |
| Colims | A LIMS system to automate and expedite proteomics data management, processing and analysis | http://compomics.github.io/projects/colims | OLS Client, OLS Dialog |
| BioSamples (7) | Resource that stores and supplies descriptions and metadata about biological samples | https://www.ebi.ac.uk/biosamples | OLS Client |
| CySBML (8) | Cytoscape plugin for importing and visualizing SBML annotations | https://sourceforge.net/projects/cysbml | OLS Client |
| BioModels Database [12] | Repository of computational models of biological processes | http://www.ebi.ac.uk/biomodels-main | OLS Client |
Abbreviations:  API, application programming interface; CURIE, Compact URI; CV, controlled vocabulary; GUI, graphical user interface; LIMS, Laboratory Information Management System; OLS, Ontology Lookup Service; OWL, Web Ontology Language; PX, ProteomeXchange; URI, Uniform Resource Identifier

Conflict of Interest

The authors have declared no conflict of interest.
  14 in total

1.  The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration.

Authors:  Barry Smith; Michael Ashburner; Cornelius Rosse; Jonathan Bard; William Bug; Werner Ceusters; Louis J Goldberg; Karen Eilbeck; Amelia Ireland; Christopher J Mungall; Neocles Leontis; Philippe Rocca-Serra; Alan Ruttenberg; Susanna-Assunta Sansone; Richard H Scheuermann; Nigam Shah; Patricia L Whetzel; Suzanna Lewis
Journal:  Nat Biotechnol       Date:  2007-11       Impact factor: 54.908

2.  An ontology for healthcare quality indicators: challenges for semantic interoperability.

Authors:  Pam White; Abdul Roudsari
Journal:  Stud Health Technol Inform       Date:  2015

3.  PeptideShaker enables reanalysis of MS-derived proteomics data sets.

Authors:  Marc Vaudel; Julia M Burkhart; René P Zahedi; Eystein Oveland; Frode S Berven; Albert Sickmann; Lennart Martens; Harald Barsnes
Journal:  Nat Biotechnol       Date:  2015-01       Impact factor: 54.908

4.  OLS dialog: an open-source front end to the ontology lookup service.

Authors:  Harald Barsnes; Richard G Côté; Ingvar Eidhammer; Lennart Martens
Journal:  BMC Bioinformatics       Date:  2010-01-17       Impact factor: 3.169

5.  Discovering and linking public omics data sets using the Omics Discovery Index.

Authors:  Yasset Perez-Riverol; Mingze Bai; Felipe da Veiga Leprevost; Silvano Squizzato; Young Mi Park; Kenneth Haug; Adam J Carroll; Dylan Spalding; Justin Paschall; Mingxun Wang; Noemi Del-Toro; Tobias Ternent; Peng Zhang; Nicola Buso; Nuno Bandeira; Eric W Deutsch; David S Campbell; Ronald C Beavis; Reza M Salek; Ugis Sarkans; Robert Petryszak; Maria Keays; Eoin Fahy; Manish Sud; Shankar Subramaniam; Ariana Barbera; Rafael C Jiménez; Alexey I Nesvizhskii; Susanna-Assunta Sansone; Christoph Steinbeck; Rodrigo Lopez; Juan A Vizcaíno; Peipei Ping; Henning Hermjakob
Journal:  Nat Biotechnol       Date:  2017-05-09       Impact factor: 54.908

6.  The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries.

Authors:  Richard G Côté; Philip Jones; Rolf Apweiler; Henning Hermjakob
Journal:  BMC Bioinformatics       Date:  2006-02-28       Impact factor: 3.169

Review 7.  Making proteomics data accessible and reusable: current state of proteomics databases and repositories.

Authors:  Yasset Perez-Riverol; Emanuele Alpi; Rui Wang; Henning Hermjakob; Juan Antonio Vizcaíno
Journal:  Proteomics       Date:  2015-03       Impact factor: 3.984

8.  ms-data-core-api: an open-source, metadata-oriented library for computational proteomics.

Authors:  Yasset Perez-Riverol; Julian Uszkoreit; Aniel Sanchez; Tobias Ternent; Noemi Del Toro; Henning Hermjakob; Juan Antonio Vizcaíno; Rui Wang
Journal:  Bioinformatics       Date:  2015-04-24       Impact factor: 6.937

9.  BioModels: Content, Features, Functionality, and Use.

Authors:  N Juty; R Ali; M Glont; S Keating; N Rodriguez; M J Swat; S M Wimalaratne; H Hermjakob; N Le Novère; C Laibe; V Chelliah
Journal:  CPT Pharmacometrics Syst Pharmacol       Date:  2015-02-26

10.  The Reactome pathway Knowledgebase.

Authors:  Antonio Fabregat; Konstantinos Sidiropoulos; Phani Garapati; Marc Gillespie; Kerstin Hausmann; Robin Haw; Bijay Jassal; Steven Jupe; Florian Korninger; Sheldon McKay; Lisa Matthews; Bruce May; Marija Milacic; Karen Rothfels; Veronica Shamovsky; Marissa Webber; Joel Weiser; Mark Williams; Guanming Wu; Lincoln Stein; Henning Hermjakob; Peter D'Eustachio
Journal:  Nucleic Acids Res       Date:  2015-12-09       Impact factor: 16.971
