BioServices: a common Python package to access biological Web Services programmatically.

Thomas Cokelaer, Dennis Pultz, Lea M Harder, Jordi Serra-Musach, Julio Saez-Rodriguez.

Abstract

MOTIVATION: Web interfaces provide access to numerous biological databases. Many of them can also be accessed programmatically thanks to Web Services. Building applications that combine several of them would benefit from a single framework.
RESULTS: BioServices is a comprehensive Python framework that provides programmatic access to major bioinformatics Web Services (e.g. KEGG, UniProt, BioModels, ChEMBLdb). Its object-oriented design eases the wrapping of additional Web Services based on either Representational State Transfer or Simple Object Access Protocol/Web Services Description Language technologies.
AVAILABILITY AND IMPLEMENTATION: BioServices releases and documentation are available at http://pypi.python.org/pypi/bioservices under a GPL-v3 license.

Year: 2013 PMID: 24064416 PMCID: PMC3842755 DOI: 10.1093/bioinformatics/btt547

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 INTRODUCTION AND MOTIVATION

Many biological databases are accessible on the World Wide Web via server-side applications that span the entire spectrum of bioinformatics (e.g. genomics, sequence analysis). Although manual requests allow quick retrieval of information, programmatic access via Web Services scales up the number of requests and permits the composition of complex workflows. One strength of Web Services is that client-side applications do not need any intimate knowledge of the database behind the service. Life sciences and bioinformatics have produced a large number of Web Services in recent years (Bhagat et al., 2010). Integrating Web Services within a single framework fosters the development of applications. A Java-based example is MAPI (Karlsson and Trelles, 2013), which has served as a base for developing biomedical applications.
Programmatic access to Web Services relies mostly on (i) REST (Representational State Transfer) and (ii) SOAP (Simple Object Access Protocol; www.w3.org/TR/soap). REST emphasizes readability: each resource corresponds to a unique URL. No external dependency is needed, as operations are carried out via standard Hypertext Transfer Protocol (HTTP) methods (e.g. GET, POST). SOAP uses an Extensible Markup Language (XML)-based messaging protocol to encode request and response messages, and WSDL (Web Services Description Language; www.w3.org/TR/wsdl) to describe the service's capabilities. To build applications that integrate several Web Services, one needs expertise in (i) HTTP requests, (ii) the SOAP protocol, (iii) the REST protocol, (iv) XML parsing to consume the XML messages and (v) the related bioinformatics fields. Besides, inputs and outputs of the services can be heterogeneous. Consequently, composing workflows or designing external applications based on several Web Services can be challenging.
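To make the contrast between the two styles concrete, the following minimal sketch builds a REST resource URL and a SOAP 1.1 request envelope using only the Python standard library. Nothing is sent over the network, and the service root (example.org), the resource and operation names (entry, getEntry), the namespace (urn:example) and the identifier (ID001) are all placeholders invented for illustration; they do not belong to any service mentioned in this article.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical service root, used only for illustration.
BASE_URL = "https://example.org/api"

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def rest_url(resource, identifier, **params):
    """Build the URL that identifies a single REST resource."""
    url = f"{BASE_URL}/{resource}/{identifier}"
    if params:
        url += "?" + urlencode(sorted(params.items()))
    return url


def soap_envelope(operation, namespace, **arguments):
    """Encode a call to `operation` as a SOAP 1.1 envelope (XML string)."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{operation}")
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# REST: the request is fully described by a URL and an HTTP method.
print(rest_url("entry", "ID001", format="fasta"))

# SOAP: the same kind of request is wrapped in an XML message.
print(soap_envelope("getEntry", "urn:example", id="ID001"))
```

In the REST case the URL alone carries the request, whereas the SOAP case needs an XML message whose allowed operations and arguments would normally be read from the service's WSDL description.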
The Python language has many useful features for researchers (Bassi, 2007): it is an object-oriented language with a precise and concise syntax and has a versatile set of standard modules. It also has a growing and thriving community of scientific developers. An example of a library dedicated to bioinformatics is BioPython (Cock et al., 2009). It provides input/output functions, algorithms and some access to Web Services (e.g. Entrez). However, a dedicated framework that easily integrates bioinformatics Web Services and provides extensive access to them is missing. We have, therefore, developed BioServices to provide programmatic access to major bioinformatics Web Services within a single software framework using Python as a glue language. It should reduce the technical knowledge needed to develop more complex applications around existing resources.

2 APPROACH AND IMPLEMENTATION

To bring together various Web Services within BioServices, we first designed two base classes, RESTService and WSDLService, to ease the wrapping of Web Services. As shown in Figure 1, all services available within BioServices are built on these two classes. A SOAP/WSDL Web Service can be wrapped concisely as follows:

    from bioservices import WSDLService

    class AWrapper(WSDLService):
        def __init__(self):
            super(AWrapper, self).__init__("AWrapper", url="validURL?wsdl")

Fig. 1.

Interaction between external applications and existing Web Services via BioServices. External applications can use BioServices to compose or aggregate several Web Services (see Table 1 for available services)
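The idea of a shared base class can be sketched without BioServices itself. The example below is a hedged illustration of the object-oriented pattern only: ServiceBase, RESTBase, ToyDatabase and the example.org URL are hypothetical names made up here, and they are not the actual RESTService or WSDLService implementation.

```python
class ServiceBase:
    """Hypothetical stand-in for a shared base class; not the BioServices API."""

    def __init__(self, name, url):
        self.name = name
        self.url = url.rstrip("/")

    def __repr__(self):
        return f"<{type(self).__name__} {self.name!r} at {self.url}>"


class RESTBase(ServiceBase):
    """Adds the one thing REST wrappers share: mapping a resource to a URL."""

    def resource_url(self, *parts):
        return "/".join([self.url, *map(str, parts)])


class ToyDatabase(RESTBase):
    """A wrapper only names itself and exposes domain-level methods."""

    def __init__(self):
        super().__init__("ToyDatabase", url="https://example.org/toydb/")

    def entry_url(self, identifier):
        return self.resource_url("entry", identifier)


service = ToyDatabase()
print(service)
print(service.entry_url("ID001"))
```

With this layout, behaviour common to every service lives in the base classes, so each new wrapper stays a few lines long.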

Table 1.

Web Services accessible from BioServices

ArrayExpress (R)	BioMart (R)	BioModels (W)
ChEBI (W)	ChEMBLdb (R)	EUtils (W)
KEGG (R)	HGNC (R)	Miriam (W)
PDB (R)	PICR (R)	PSICQUIC (R)
QuickGO (R)	Rhea (R)	UniChem (R)
UniProt (R)	NCBIBlast (R)	WikiPathways (W)

Note: R stands for REST and W stands for SOAP/WSDL protocol.

Similarly, REST services can be wrapped concisely (replace WSDLService by RESTService), as explained in the Developer Section of the Supplementary Data. An example of a SOAP/WSDL service wrapped within BioServices is BioModels (Li et al., 2010). Consider the following example:

    1 from bioservices import BioModels
    2 s = BioModels()
    3 s.methods # methods exposed by WSDL
    4 s.serv.getAllModelsId()
    5 s.getAllModelsId()

All methods exposed by the service are listed in the methods attribute (line 3). They can be called directly via the serv attribute; for example, all model identifiers can be retrieved (line 4). Methods are then wrapped (line 5) to add robustness and quality.
Web Services currently available in BioServices (see Table 1) can be used independently, but they can also be combined. Amongst the various examples provided in the Supplementary Data, a case study demonstrates how to retrieve a protein's UniProt identifier, its corresponding FASTA sequence, the related Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, the interactions with other proteins (PSICQUIC) and so forth.
Two issues arise when manipulating several services, especially for end-users: (i) heterogeneous data structures are returned and (ii) a plethora of identifiers and keywords are required. Both issues are unfortunately inherent to the diversity of the Web Services used. Although some data structures are commonly used (e.g. XML format), there is still a variety of data structures to deal with. BioServices addresses the first issue by providing extensive documentation and examples. As for the identifiers issue, although BioServices does not provide mapping functions by itself, it gives access to mapping functions from UniProt, KEGG and UniChem (among others). See the online documentation (http://pypi.python.org/pypi/bioservices) for examples.
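The first issue, heterogeneous return formats, can be illustrated with a small standard-library sketch. It assumes two toy payloads written for this example (one XML, one tab-separated) that already share an "id" field; the payloads, field names and helper functions are hypothetical and do not reproduce the output of any real service or any BioServices function.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Toy payloads written for this example; they do not come from any real service.
XML_PAYLOAD = "<entries><entry id='A1' name='alpha'/><entry id='B2' name='beta'/></entries>"
TSV_PAYLOAD = "id\tpathway\nA1\tpath:001\nB2\tpath:002\n"


def records_from_xml(text):
    """Turn each <entry> element into a dict of its attributes."""
    return [dict(node.attrib) for node in ET.fromstring(text).iter("entry")]


def records_from_tsv(text):
    """Turn each tab-separated row into a dict keyed by the header line."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter="\t")]


def merge_on(key, *record_sets):
    """Combine records from several sources that share the same `key` value."""
    merged = {}
    for records in record_sets:
        for record in records:
            merged.setdefault(record[key], {}).update(record)
    return merged


combined = merge_on("id", records_from_xml(XML_PAYLOAD), records_from_tsv(TSV_PAYLOAD))
for identifier, fields in combined.items():
    print(identifier, fields)
```

In practice the services rarely share an identifier scheme, so a mapping step (the second issue above) would be needed before records could be joined in this way.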

3 CONCLUSION/RESULTS

BioServices provides comprehensive access to bioinformatics Web Services within a single Python library; the current release (1.1.1) provides access to 18 Web Services (see Table 1). The methodology used to encapsulate Web Services and their functionalities, combined with Python, allows pipelines that combine several Web Services to be implemented concisely. In addition, extensive online documentation (http://pypi.python.org/pypi/bioservices) should help users and developers deal with the profusion of identifiers and data structures inherent to the diversity of Web Services available.
Releases are available on PyPI (http://pypi.python.org/pypi/bioservices), the official Python package repository. Developers can obtain the source code from a public server (https://www.assembla.com/spaces/bioservices/wiki). Bug reports and feature requests are encouraged (https://www.assembla.com/spaces/bioservices/tickets), and contributors are welcome to join the user and developer community (https://www.assembla.com/spaces/bioservices/wiki). Tests with large coverage are included to ensure robustness against potential modifications of the Web Services themselves. By covering a wide range of Web Services, BioServices can complement external libraries (e.g. BioPython, Galaxy; see Supplementary Data) and foster the development of new workflows.
Funding: Danish Research Councils (to L.M.H. and D.P.), Lundbeck Foundation (to L.M.H.), Foundation Ferran Sunyer i Balaguer (to J.S.M.), Biomedical Research Institute of Girona (to J.S.M.) and EU BioPreDyn FP7-KBBE (grant 289434).
Conflict of Interest: none declared.

5 in total

1. BioModels.net Web Services, a free and integrated toolkit for computational modelling software.

Authors: Chen Li; Mélanie Courtot; Nicolas Le Novère; Camille Laibe
Journal: Brief Bioinform Date: 2009-11-25 Impact factor: 11.622

2. Biopython: freely available Python tools for computational molecular biology and bioinformatics.

Authors: Peter J A Cock; Tiago Antao; Jeffrey T Chang; Brad A Chapman; Cymon J Cox; Andrew Dalke; Iddo Friedberg; Thomas Hamelryck; Frank Kauff; Bartek Wilczynski; Michiel J L de Hoon
Journal: Bioinformatics Date: 2009-03-20 Impact factor: 6.937

3. BioCatalogue: a universal catalogue of web services for the life sciences.

Authors: Jiten Bhagat; Franck Tanoh; Eric Nzuobontane; Thomas Laurent; Jerzy Orlowski; Marco Roos; Katy Wolstencroft; Sergejs Aleksejevs; Robert Stevens; Steve Pettifer; Rodrigo Lopez; Carole A Goble
Journal: Nucleic Acids Res Date: 2010-05-19 Impact factor: 16.971

4. A primer on python for life science researchers. (Review)

Authors: S Bassi
Journal: PLoS Comput Biol Date: 2007-11 Impact factor: 4.475

5. MAPI: a software framework for distributed biomedical applications.

Authors: Johan Karlsson; Oswaldo Trelles
Journal: J Biomed Semantics Date: 2013-01-11
