
A new reference implementation of the PSICQUIC web service.

Noemi del-Toro, Marine Dumousseau, Sandra Orchard, Rafael C Jimenez, Eugenia Galeota, Guillaume Launay, Johannes Goll, Karin Breuer, Keiichiro Ono, Lukasz Salwinski, Henning Hermjakob.

Abstract

The Proteomics Standard Initiative Common QUery InterfaCe (PSICQUIC) specification was created by the Human Proteome Organization Proteomics Standards Initiative (HUPO-PSI) to enable computational access to molecular-interaction data resources by means of a standard Web Service and query language. Currently providing >150 million binary interaction evidences from 28 servers globally, the PSICQUIC interface allows the concurrent search of multiple molecular-interaction information resources using a single query. Here, we present an extension of the PSICQUIC specification (version 1.3), which has been released to be compliant with the enhanced standards in molecular interactions. The new release also includes a new reference implementation of the PSICQUIC server available to the data providers. It offers augmented web service capabilities and improves the user experience. PSICQUIC has been running for almost 5 years, with a user base growing from only 4 data providers to 28 (April 2013) allowing access to 151 310 109 binary interactions. The power of this web service is shown in PSICQUIC View web application, an example of how to simultaneously query, browse and download results from the different PSICQUIC servers. This application is free and open to all users with no login requirement (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml).

Year:  2013        PMID: 23671334      PMCID: PMC3977660          DOI: 10.1093/nar/gkt392

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

One of the main issues currently facing the scientific community is the integration of data generated by the many different instruments and software platforms used in high-throughput experiments. The Human Proteome Organization Proteomics Standards Initiative (HUPO-PSI) was founded with the aim of developing standards to unify the diversity of data produced by proteomics experiments (1). In 2004, the Molecular Interaction (MI) group of the PSI jointly published a community-standard XML data model for the representation and exchange of protein-interaction data (2). The same work group subsequently published the Minimum Information about a Molecular Interaction Experiment (MIMIx) (3) guidelines, defining a list of parameters to be supplied when describing experimental molecular-interaction data in a journal publication. A number of public interaction databases have gone still further, forming the International Molecular Exchange consortium (IMEx) (4,5) to facilitate assembly of a single non-redundant set of consistently curated protein-interaction data. This original XML model was later further refined and in 2007 was supplemented by a simple tab delimited format, PSI-MITAB (6). These formats have been widely adopted by molecular-interaction databases (7), enabling the initial development of the Proteomics Standard Initiative Common QUery InterfaCe (PSICQUIC) service. The PSICQUIC (8) specification is defined by means of a Web Service, with a clean-cut set of methods that have as input a query in MIQL (Molecular Interactions Query Language). The initial release of PSICQUIC supported only the very limited set of 15 fields of PSI-MITAB 2.5, which represented a simplistic description of molecular-interaction data. Following the development of extended PSI-MITAB formats (6) (version 2.6 and more recently 2.7), the number of fields has been increased to be fully compliant with MIMIx and enable presentation of IMEx-standard curated data (4). 
All these changes have resulted in the development of a new PSICQUIC specification encompassing extensions to the MIQL query language and a completely new implementation of the PSICQUIC reference server. Accompanying documentation helps data suppliers to easily migrate to the new, more efficient and feature-rich server.

MATERIALS AND METHODS

PSICQUIC specification

PSICQUIC defines a minimum set of standard SOAP and REST methods to be implemented by every molecular-interaction provider. These methods accept a MIQL query as input and return, as output, molecular-interaction information in one of the standard formats (PSI-XML 2.5, PSI-MITAB 2.5, PSI-MITAB 2.6, PSI-MITAB 2.7). The PSICQUIC SOAP-based services are defined through a standard WSDL specification with which all implementations must comply. This definition has remained stable since PSICQUIC specification version 1.1; however, the capability to return the new PSI-MITAB versions has been added. Among the methods described in the specification, the most flexible is getByQuery. It can be used to perform complex queries, as it accepts all the fields defined in MIQL. The results are returned, as specified by the user, in one of the standard formats (PSI-XML or PSI-MITAB). The remaining SOAP methods do not directly use MIQL. A summary of the main methods is shown in Table 1 and Table 2, and further information about their options is available in the PSICQUIC specification for SOAP on the PSICQUIC project website (http://code.google.com/p/psicquic/wiki/PsicquicSpec_1_3_Soap).
Table 1.

Summary of the main available methods in SOAP

Method name | Description
getByQuery | Retrieve data using an MIQL query
getByInteractor | Retrieve data using a specific participant identifier (equivalent MIQL field identifier)
getByInteractorList | Retrieve data using a list of participant identifiers. This method can be used to retrieve interactions where the two or more participants passed as arguments are found. To do so, set the operand to AND.
getByInteraction | Retrieve a specific interaction using its identifier (equivalent MIQL field interaction_id)
getByInteractionList | Retrieve a list of interactions, using the identifiers

Table 2.

SOAP methods to retrieve information about the service itself (metadata)

Method name | Description
getSupportedReturnTypes | Returns the list of possible formats for the retrieved data
getVersion | Gets the version of the service
getProperties | Retrieve a list of the property objects defined in a service by the provider. Each property object has a key and a value
getProperty | Retrieve a property from the service

In addition to the SOAP-based protocol, PSICQUIC also implements a set of RESTful services that make it possible to retrieve data over HTTP using simple URLs. This protocol can also be used to access molecular interactions through scripting languages and supports other common output formats such as Resource Description Framework (RDF), Biological Pathway Exchange (BioPAX) and eXtensible Graph Markup and Modeling Language (XGMML). It should be noted that these formats have existed since PSICQUIC specification version 1.2 and are only available through the REST service. As in SOAP, there is an ample set of methods to choose from. Figure 1 demonstrates how to access the information by means of HTTP GET requests: a template URL describes the main methods with their options and outputs. A more extensive description of the methods and options is presented in the PSICQUIC specification for REST (http://code.google.com/p/psicquic/wiki/PsicquicSpec_1_3_Rest).
Figure 1.

Structure of the URL to fetch data from PSICQUIC service.
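
As an illustration of the REST access described above, the following minimal Python sketch assembles a search URL without sending any request. The path layout (`/current/search/<method>/<value>`), the parameter names `format`, `firstResult` and `maxResults`, and the format label `tab27` are assumptions to be checked against the PSICQUIC REST specification; the base URL and the helper name `build_search_url` are hypothetical placeholders, since real service URLs are obtained from the registry.

```python
from urllib.parse import quote, urlencode

# Hypothetical provider base URL (placeholder, not a real PSICQUIC service).
BASE_URL = "http://example.org/psicquic/webservices"

# Search methods assumed to be exposed over REST.
SEARCH_METHODS = ("query", "interactor", "interaction")


def build_search_url(base_url, method, value, fmt="tab25",
                     first_result=0, max_results=None):
    """Assemble a REST search URL (assumed layout; verify against the spec)."""
    if method not in SEARCH_METHODS:
        raise ValueError(f"unsupported search method: {method!r}")
    # Percent-encode the MIQL query or identifier so it is safe inside a path.
    path = f"{base_url.rstrip('/')}/current/search/{method}/{quote(value, safe='')}"
    params = {"format": fmt, "firstResult": first_result}
    if max_results is not None:
        params["maxResults"] = max_results
    return f"{path}?{urlencode(params)}"


url = build_search_url(BASE_URL, "query", "species:human AND brca2",
                       fmt="tab27", max_results=50)
print(url)
```

The resulting string can be passed to any HTTP client; paging through large result sets would then amount to increasing `first_result` in steps of `max_results`.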


MIQL

The main input to the PSICQUIC web service is a query written in the MIQL query language. MIQL defines a set of standard fields to query molecular-interaction data, extending the syntax of the Apache Lucene query language on which it is based. MIQL has also been updated with the new data fields; these fields allow users to filter (or query) molecular interactions with novel criteria that adjust the result to their needs and reduce post-processing steps. The retrieval of information that was previously unavailable has thus been enabled. For example, PSI-MITAB 2.6 introduces the ‘complex expansion’ field. Thanks to this field, and with the new PSICQUIC service, the user is now able to distinguish results that come from an original binary interaction from binary pairs resulting from the expansion of an n-ary interaction with one of the available expansion methods (spoke expansion, bipartite expansion or matrix expansion); this distinction was previously impossible. Adding the ‘stoichiometry’ field (included in PSI-MITAB 2.7) allows the retrieval of information about intra-molecular interactions, and with the inclusion of the ‘features’ field, PSICQUIC is able to provide, for the first time, fully MIMIx-compliant information. In the updated PSI-MITAB formats, some information has been reallocated to the new columns. This removed some previously existing inconsistencies and made the records more accurate and easier to access through a MIQL query. See PSICQUIC extension for MIQL (http://code.google.com/p/psicquic/wiki/MiqlReference27) for a detailed description of the additional fields.
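
Because MIQL follows Lucene-style `field:value` syntax, a query can be composed as a plain string. The sketch below is one possible way to do so in Python; the helper names are hypothetical, the field `identifier` is taken from Table 1, and the fields `species` and `detmethod` as well as the example values are assumptions that should be checked against the MIQL reference.

```python
def miql_term(field, value):
    """Render one field:value term, quoting values with special characters."""
    text = str(value)
    if any(ch in text for ch in ' :()"'):
        # Lucene-style phrase quoting; embedded double quotes are escaped.
        text = '"' + text.replace('"', '\\"') + '"'
    return f"{field}:{text}"


def miql_and(*terms):
    """Join terms so that all of them must match."""
    return " AND ".join(terms)


def miql_or(*terms):
    """Join alternatives in parentheses so they group correctly."""
    return "(" + " OR ".join(terms) + ")"


# Illustrative values only; the field names other than "identifier" are assumed.
query = miql_and(
    miql_term("identifier", "P38398"),
    miql_term("species", "human"),
    miql_or(miql_term("detmethod", "two hybrid"),
            miql_term("detmethod", "MI:0018")),
)
print(query)
```

Building the string through small helpers keeps quoting in one place, which matters because ontology identifiers contain a colon that would otherwise be read as a field separator.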

Reference implementation

The open-source reference implementation described in this article has been wholly rewritten independently from the original (3) but remains backwards compatible with the previous versions of all the protocols. This new PSICQUIC service is based on the Apache Solr indexing software (http://lucene.apache.org/solr/), a web application built on top of Apache Lucene technology. From the data provider perspective, the new open-source reference implementation of PSICQUIC allows a local PSICQUIC server to be easily set up and loaded with data provided as a valid PSI-MITAB file. It supports the original 15-column PSI-MITAB 2.5 as well as the newer PSI-MITAB 2.6 and PSI-MITAB 2.7 formats, with 36 and 42 columns respectively (see Table 3).
Table 3.

Evolution of PSI-MITAB format

PSI-MITAB 2.5 (15 cols) | PSI-MITAB 2.6 (+21 cols) | PSI-MITAB 2.7 (+6 cols)
ID(s) interactor A & B | Experimental role(s) A & B | Features A & B
Alt. ID(s) interactor A & B | Biological role(s) A & B | Stoichiometry A & B
Alias(es) interactor A & B | Properties (CrossReference) A & B | Participant detection method A & B
Interaction detection method(s) | Type(s) of interactors A & B |
Publication 1st author(s) | Host organism |
Publication Identifier(s) | Expansion method(s) |
Taxid interactor A & B | Annotations A & B |
Interaction type(s) | Parameters |
Source database(s) | Creation/update date |
Interaction identifier(s) | Checksums A, B & interaction |
Confidence value(s) | Negative |

In addition to introducing support for the new PSI-MITAB versions and the MIQL extension, extensive restructuring of the code resulted in improved response time of the server. It also removed the restriction, which previously existed in the REST protocol, on the number of interactions that can be exported in the XGMML format used in Cytoscape (9,10): the server now sends small chunks of interactions until the file is completely transmitted instead of truncating it as before. All these improvements enhance the web service and the concurrent search of multiple molecular-interaction databases independently of the different clients.
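
The column counts given for the three PSI-MITAB versions (15, 36 and 42) are enough to tell them apart when reading a file. The following Python sketch, which is not part of the reference implementation, infers the version of a tab-delimited line and extracts the first two columns (identifiers of interactors A and B); the function names and the sample identifiers are hypothetical.

```python
# Column counts per PSI-MITAB version, as stated in the text.
MITAB_VERSIONS = {15: "2.5", 36: "2.6", 42: "2.7"}


def split_mitab(line):
    """Split one PSI-MITAB line into its tab-delimited columns."""
    return line.rstrip("\r\n").split("\t")


def mitab_version(line):
    """Infer the PSI-MITAB version from the number of columns."""
    n_columns = len(split_mitab(line))
    if n_columns not in MITAB_VERSIONS:
        raise ValueError(f"unexpected number of columns: {n_columns}")
    return MITAB_VERSIONS[n_columns]


def interactor_ids(line):
    """Return the identifier columns of interactors A and B."""
    columns = split_mitab(line)
    return columns[0], columns[1]


# Made-up 15-column record; "-" stands for an empty column.
sample = "\t".join(["uniprotkb:P12345", "uniprotkb:Q67890"] + ["-"] * 13)
print(mitab_version(sample), interactor_ids(sample))
```

A check of this kind is useful before indexing, since a file with an unexpected column count is unlikely to be valid PSI-MITAB.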

Server deployment

The reference implementation source code can be downloaded from the PSICQUIC Google project repository (svn co ). It includes a Java class to create the index from the PSI-MITAB file and a script that can be used to easily start the indexing process (bash indexMitab.sh /path/to/mitab-file solr-index-directory). The solr-index directory will contain the index, the Solr configuration files, the Solr schema and the solr.war file required to run the Solr application. More detailed information and other options for deploying a PSICQUIC server are available on the PSICQUIC website (https://code.google.com/p/psicquic/wiki/HowToInstallPsicquicSolr). Figure 2 shows the different elements required to build a PSICQUIC service from an interaction database. Solr indexing will enable the development of facilities such as visualization of statistical data through faceting, indexing from PSI-XML and data sorting. In addition to using the default implementation presented, providers can also implement their own systems to publish interactions as long as they meet the PSICQUIC specifications (http://code.google.com/p/psicquic/wiki/PsicquicSpecification).
Figure 2.

Dataflow followed by the reference implementation from its origin in the molecular-interaction databases to the end user through PSICQUIC.


PSICQUIC clients

In addition to using the services directly from the browser (in the case of REST) or creating a custom client, users have several applications and libraries at their disposal for querying the web services programmatically. The PSICQUIC project site (http://psicquic.googlecode.com) offers open-source libraries for working with the different standards, Java clients to access the web services (http://code.google.com/p/psicquic/wiki/JavaClient), code examples in Perl (http://code.google.com/p/psicquic/wiki/PerlCodeSamples) and Python (http://code.google.com/p/psicquic/wiki/PythonCodeSamples), and help for broad use cases. Important clients include the molecular interactions cluster (http://code.google.com/p/micluster), the PSICQUIC Client Plugin for Cytoscape (http://apps.cytoscape.org/apps/psicquicuniversalclient) and PSICQUIC View (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml). See Figure 3.
Figure 3.

PSICQUIC View is a client for PSICQUIC services that, from a single query, fetches all the relevant molecular interactions available in the registered services. After the search, the user can study the results in more detail, view the interaction network, or download or cluster the results.

PSICQUIC registry

Users are expected to obtain the PSICQUIC web service SOAP or REST URLs by querying the PSICQUIC Registry. In addition to providing the necessary URLs, the registry is itself a REST web service, offering data on the number of interactions per service, the status of each service, a statement as to whether the data are restricted or not, the version of the software used and a short description of the type of service given by means of tags. The PSICQUIC registry is currently hosted at the European Bioinformatics Institute (http://www.ebi.ac.uk/Tools/webservices/psicquic/registry/registry?action=STATUS). Figure 4 explains how to retrieve the information from the registry through HTTP. More information about the PSICQUIC registry is available at the PSICQUIC project site (http://code.google.com/p/psicquic/wiki/Registry).
Figure 4.

Structure of the URL to fetch data from PSICQUIC registry.
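
A client could combine the registry with the services roughly as sketched below. The registry address and the `action=STATUS` parameter come from the URL quoted above; the extra `format` option and the plain-text `name=url` response layout are assumptions that should be verified against the registry documentation, and the sample response and function names are invented for illustration. No network request is made.

```python
from urllib.parse import urlencode

REGISTRY_URL = "http://www.ebi.ac.uk/Tools/webservices/psicquic/registry/registry"


def registry_url(action="STATUS", **options):
    """Build a registry URL; options such as format="txt" are assumed."""
    params = {"action": action, **options}
    return f"{REGISTRY_URL}?{urlencode(params)}"


def parse_service_list(text):
    """Parse an assumed 'name=url' per-line listing into a dictionary."""
    services = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, url = line.split("=", 1)
        services[name.strip()] = url.strip()
    return services


# Invented response used only to exercise the parser.
sample_response = (
    "ServiceA=http://example.org/a/webservices/current/search/\n"
    "ServiceB=http://example.org/b/webservices/current/search/\n"
)
print(registry_url())
print(parse_service_list(sample_response))
```

Splitting on the first `=` only keeps any `=` characters that appear later in a service URL intact.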


DISCUSSION

In <5 years since its original implementation, PSICQUIC has grown from 4 to 28 providers supplying >150 million interactions, with additional services preparing to join. The new reference implementation opens the door to additional features such as sorting by different criteria (for example, the confidence score of the interactions) or faceting to retrieve statistics. Longer-term plans include direct indexing of the PSI-XML data to allow processing of the molecular-interaction data described in the original PSI-XML files, thus avoiding the currently necessary, lossy conversions between PSI-XML and PSI-MITAB formats. This, in turn, will enable the querying and retrieval of n-ary interactions rather than only binary pairs.

FUNDING

European Commission grant PSIMEx [FP7-HEALTH-2007-223411]; National Institutes of Health [R01GM071909 to L.S.]; National Heart, Lung, and Blood Institute Proteomics Center Award [HHSN268201000035C to R.J]; Genome BC through the Pathogenomics of Innate Immunity (PI2) project; Foundation for the National Institutes of Health and the Canadian Institutes of Health Research under the Grand Challenges in Global Health Research Initiative [Grand Challenges ID: 419 to K.B.]; AllerGen [12ASI1;12B&B2] (to K.B.). Funding for open access charge: European Commission [FP7-HEALTH-2007-223411]. Conflict of interest statement. None declared.

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  Submit your interaction data the IMEx way: a step by step guide to trouble-free deposition.

Authors:  Sandra Orchard; Samuel Kerrien; Philip Jones; Arnaud Ceol; Andrew Chatr-Aryamontri; Lukasz Salwinski; Jason Nerothin; Henning Hermjakob
Journal:  Proteomics       Date:  2007-09       Impact factor: 3.984

Review 3.  The minimum information required for reporting a molecular interaction experiment (MIMIx).

Authors:  Sandra Orchard; Lukasz Salwinski; Samuel Kerrien; Luisa Montecchi-Palazzi; Matthias Oesterheld; Volker Stümpflen; Arnaud Ceol; Andrew Chatr-aryamontri; John Armstrong; Peter Woollard; John J Salama; Susan Moore; Jérôme Wojcik; Gary D Bader; Marc Vidal; Michael E Cusick; Mark Gerstein; Anne-Claude Gavin; Giulio Superti-Furga; Jack Greenblatt; Joel Bader; Peter Uetz; Mike Tyers; Pierre Legrain; Stan Fields; Nicola Mulder; Michael Gilson; Michael Niepmann; Lyle Burgoon; Javier De Las Rivas; Carlos Prieto; Victoria M Perreau; Chris Hogue; Hans-Werner Mewes; Rolf Apweiler; Ioannis Xenarios; David Eisenberg; Gianni Cesareni; Henning Hermjakob
Journal:  Nat Biotechnol       Date:  2007-08       Impact factor: 54.908

Review 4.  Molecular interaction databases.

Authors:  Sandra Orchard
Journal:  Proteomics       Date:  2012-05       Impact factor: 3.984

5.  PSICQUIC and PSISCORE: accessing and scoring molecular interactions.

Authors:  Bruno Aranda; Hagen Blankenburg; Samuel Kerrien; Fiona S L Brinkman; Arnaud Ceol; Emilie Chautard; Jose M Dana; Javier De Las Rivas; Marine Dumousseau; Eugenia Galeota; Anna Gaulton; Johannes Goll; Robert E W Hancock; Ruth Isserlin; Rafael C Jimenez; Jules Kerssemakers; Jyoti Khadake; David J Lynn; Magali Michaut; Gavin O'Kelly; Keiichiro Ono; Sandra Orchard; Carlos Prieto; Sabry Razick; Olga Rigina; Lukasz Salwinski; Milan Simonovic; Sameer Velankar; Andrew Winter; Guanming Wu; Gary D Bader; Gianni Cesareni; Ian M Donaldson; David Eisenberg; Gerard J Kleywegt; John Overington; Sylvie Ricard-Blum; Mike Tyers; Mario Albrecht; Henning Hermjakob
Journal:  Nat Methods       Date:  2011-06-29       Impact factor: 28.547

6.  Protein interaction data curation: the International Molecular Exchange (IMEx) consortium.

Authors:  Sandra Orchard; Samuel Kerrien; Sara Abbani; Bruno Aranda; Jignesh Bhate; Shelby Bidwell; Alan Bridge; Leonardo Briganti; Fiona S L Brinkman; Fiona Brinkman; Gianni Cesareni; Andrew Chatr-aryamontri; Emilie Chautard; Carol Chen; Marine Dumousseau; Johannes Goll; Robert E W Hancock; Robert Hancock; Linda I Hannick; Igor Jurisica; Jyoti Khadake; David J Lynn; Usha Mahadevan; Livia Perfetto; Arathi Raghunath; Sylvie Ricard-Blum; Bernd Roechert; Lukasz Salwinski; Volker Stümpflen; Mike Tyers; Peter Uetz; Ioannis Xenarios; Henning Hermjakob
Journal:  Nat Methods       Date:  2012-04       Impact factor: 28.547

7.  The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

Authors:  Henning Hermjakob; Luisa Montecchi-Palazzi; Gary Bader; Jérôme Wojcik; Lukasz Salwinski; Arnaud Ceol; Susan Moore; Sandra Orchard; Ugis Sarkans; Christian von Mering; Bernd Roechert; Sylvain Poux; Eva Jung; Henning Mersch; Paul Kersey; Michael Lappe; Yixue Li; Rong Zeng; Debashis Rana; Macha Nikolski; Holger Husi; Christine Brun; K Shanker; Seth G N Grant; Chris Sander; Peer Bork; Weimin Zhu; Akhilesh Pandey; Alvis Brazma; Bernard Jacq; Marc Vidal; David Sherman; Pierre Legrain; Gianni Cesareni; Ioannis Xenarios; David Eisenberg; Boris Steipe; Chris Hogue; Rolf Apweiler
Journal:  Nat Biotechnol       Date:  2004-02       Impact factor: 54.908

8.  Cytoscape 2.8: new features for data integration and network visualization.

Authors:  Michael E Smoot; Keiichiro Ono; Johannes Ruscheinski; Peng-Liang Wang; Trey Ideker
Journal:  Bioinformatics       Date:  2010-12-12       Impact factor: 6.937

9.  The HUPO Proteomics Standards Initiative Meeting: Towards Common Standards for Exchanging Proteomics Data.

Authors:  Sandra Orchard; Paul Kersey; Henning Hermjakob; Rolf Apweiler
Journal:  Comp Funct Genomics       Date:  2003

10.  Broadening the horizon--level 2.5 of the HUPO-PSI format for molecular interactions.

Authors:  Samuel Kerrien; Sandra Orchard; Luisa Montecchi-Palazzi; Bruno Aranda; Antony F Quinn; Nisha Vinod; Gary D Bader; Ioannis Xenarios; Jérôme Wojcik; David Sherman; Mike Tyers; John J Salama; Susan Moore; Arnaud Ceol; Andrew Chatr-Aryamontri; Matthias Oesterheld; Volker Stümpflen; Lukasz Salwinski; Jason Nerothin; Ethan Cerami; Michael E Cusick; Marc Vidal; Michael Gilson; John Armstrong; Peter Woollard; Christopher Hogue; David Eisenberg; Gianni Cesareni; Rolf Apweiler; Henning Hermjakob
Journal:  BMC Biol       Date:  2007-10-09       Impact factor: 7.431
