José M Fernández, Robert Hoffmann, Alfonso Valencia.
Abstract
iHOP provides fast, accurate, comprehensive, and up-to-date summary information on more than 80,000 biological molecules by automatically extracting key sentences from millions of PubMed documents. Its intuitive user interface and navigation scheme have made iHOP extremely successful among biologists, counting more than 500,000 visits per month (iHOP access statistics: http://www.ihop-net.org/UniPub/iHOP/info/logs/). Here we describe a public programmatic API that enables the integration of main iHOP functionalities in bioinformatic programs and workflows.
Year: 2007 PMID: 17485473 PMCID: PMC1933131 DOI: 10.1093/nar/gkm298
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Brief description of the web services API models used by the iHOP web services
| Web services API models | Description |
|---|---|
| REST | The Representational State Transfer paradigm is based on the HTTP protocol. It is an improvement over the CGI sub-protocol, which web browsers use to send the data from HTML forms. The main difference lies in the results returned: a REST service is usually restricted to answering with an XML document containing all the information, whereas a CGI service can return any document type (HTML, GIF, PDF, etc.). CGI and REST also differ in how queries are sent: a CGI service receives the query data as key-value pairs, whereas REST implementations send an XML document containing the whole query. Some implementations following the REST paradigm (e.g. DAS and iHOP CGI-XML) use the CGI sub-protocol for queries and answer with an XML document. |
| SOAP+WSDL | The Simple Object Access Protocol (SOAP) web services paradigm tries to isolate the communication protocol from the message representation by using XML-formatted messages and different message exchange patterns. SOAP messages have an envelope and a body, so complex features and message exchange patterns can be implemented regardless of the communication protocol used. Although there are SOAP client and server libraries for FTP, SMTP, POP3 and other protocols, the most widely used communication protocol is HTTP. The Web Services Description Language (WSDL) is a companion technology of SOAP services. WSDL documents are usually used to describe SOAP services: their inputs and outputs, their XML data types, the data encoding and the message patterns to use. Although any SOAP service can be used without a WSDL document describing it, WSDL documents are a standard way to distribute that information. |
| BioMOBY | The BioMOBY web services architecture has three main roles: MOBY Central, MOBY clients and MOBY services. MOBY Central is the repository of the ontologies used by the architecture: the namespace ontology, which is used to label biological references so they can be unambiguously identified; the object ontology, which defines the object types that can be consumed and produced by the services; the service type ontology, which contains the classification of service types used to label the different services; and the service ontology, which contains the description of all the registered MOBY services. One of the features that distinguishes the BioMOBY architecture from the SOAP protocol is that many queries can be clustered in a single message when they are sent to a service. Although the BioMOBY architecture has its own message formats and query protocols, the SOAP infrastructure is used to wrap MOBY messages. The BioMOBY community is now discussing the need to drop SOAP (due to its overhead) in favour of the lighter REST paradigm. |
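
The difference between a CGI-style key-value query and a SOAP message can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the endpoint URL, the parameter names (`query`, `taxid`), the operation name `findSymbols` and the `<symbol name="..."/>` answer format are all hypothetical and are not taken from the iHOP API. Only the SOAP 1.1 envelope namespace is a real, standard value.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard value).
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_query_url(base_url, params):
    """CGI-style request: the query travels as key-value pairs in the URL."""
    return base_url + "?" + urlencode(params)


def build_soap_envelope(operation, params):
    """SOAP-style request: the query travels as an XML body inside an envelope."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV_NS)
    op = ET.SubElement(body, operation)  # hypothetical operation element
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_symbols(xml_text):
    """Read symbol names from a made-up XML answer (format is an assumption)."""
    root = ET.fromstring(xml_text)
    return [el.get("name") for el in root.iter("symbol")]


if __name__ == "__main__":
    # Hypothetical endpoint, used only to show the shape of the request.
    print(build_query_url("http://example.org/service",
                          {"query": "breast cancer", "taxid": 9606}))
    print(build_soap_envelope("findSymbols", {"query": "BRCA2"}))
    print(parse_symbols('<result><symbol name="BRCA2"/></result>'))
```

In both cases the answer is an XML document, so the same parsing step can serve either request style; only the way the query is packaged changes.
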
Figure 1. Schema of operations of the iHOP web services. Each box is a web service, and double boxes are the recommended starting points for workflows. Links show some suggested flows between services that are useful for workflow building. Green links represent bidirectional flow, and black-arrowed blue lines represent unidirectional flow. Thick grey lines illustrate bidirectional external access, and dashed orange lines are accesses to the iHOP core (http://www.ihop-net.org/).
Figure 2. A Taverna workflow diagram that takes free text as input (e.g. ‘breast cancer’). The workflow fetches the gene and protein symbols related to the input text and returns those symbols, their synonyms and all the abstracts containing sentences in which the symbols show evidence of interaction with other protein or gene symbols.
Logical iHOP web services functionalities. These functionalities and the web services that implement them are focused on automation, so almost all functionalities accept more than one input type. Consequently, depending on the API model, some of these functionalities have been implemented more than once, one time for each possible input type.
| Functionality | Inputs | Results |
|---|---|---|
| | Free text (e.g. P53, breast cancer or BRCA2), and an optional NCBI TaxID. | A list of the possible iHOP identified gene or protein symbols related to the input. |
| | A biological database reference (e.g. UniProt accession, NCBI GENE). | The iHOP identified gene or protein symbol related to the input. |
| | Free text, and an optional NCBI TaxID. | The iHOP identified gene or protein symbol related to the input, chosen by naïve heuristics. |
| | A biological database reference. | The iHOP identified gene or protein symbol related to the input. |
| | Free text and an optional NCBI TaxID, or an iHOP gene or protein symbol ID, or a biological database reference. | The information available at the iHOP server about the iHOP identified gene or protein related to the input (name, organism, database references, synonyms, etc.). When free text is used, naïve heuristics are used to choose the iHOP identified gene. |
| | Free text and an optional NCBI TaxID, or an iHOP gene or protein symbol ID, or a biological database reference. | The abstract sentences available at the iHOP server which uniquely define the iHOP identified gene or protein related to the input, along with their score, iHOP abstract ID, journal impact, etc. When free text is used, naïve heuristics are used to choose the iHOP symbol. |
| | Free text and an optional NCBI TaxID, or an iHOP gene or protein symbol ID, or a biological database reference. | The abstract sentences available at the iHOP server which show evidence of interactions between the iHOP identified gene or protein symbol related to the input and other iHOP symbols, along with their score, iHOP abstract ID, journal impact, etc. When free text is used, naïve heuristics are used to choose the iHOP symbol. |
| | A PubMed PMID, or an iHOP abstract ID. | The iHOP analysed and enriched PubMed abstract associated with the input. All the abstract sentences are annotated, focusing on remarkable sentence elements (verbs, nouns, adjectives, gene or protein symbols, etc.). |
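
A workflow like the one in Figure 2 chains several of these functionalities: free text is resolved to symbols, and each symbol is then used to fetch synonyms and interaction sentences. The sketch below shows only that chaining logic. It assumes three caller-supplied functions (`find_symbols`, `get_synonyms`, `get_interactions`); these names, the `SymbolReport` type and the in-memory sample data are hypothetical stand-ins and not part of the iHOP API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SymbolReport:
    """Collected results for one symbol (hypothetical container type)."""
    symbol_id: str
    synonyms: List[str] = field(default_factory=list)
    interaction_sentences: List[str] = field(default_factory=list)


def run_workflow(
    free_text: str,
    find_symbols: Callable[[str], List[str]],
    get_synonyms: Callable[[str], List[str]],
    get_interactions: Callable[[str], List[str]],
) -> List[SymbolReport]:
    """Resolve free text to symbols, then enrich each symbol in turn."""
    reports = []
    for symbol_id in find_symbols(free_text):
        reports.append(
            SymbolReport(
                symbol_id=symbol_id,
                synonyms=get_synonyms(symbol_id),
                interaction_sentences=get_interactions(symbol_id),
            )
        )
    return reports


if __name__ == "__main__":
    # Made-up in-memory data standing in for remote service calls.
    symbols = {"breast cancer": ["SYM1", "SYM2"]}
    synonyms = {"SYM1": ["alias-a"], "SYM2": []}
    sentences = {"SYM1": ["SYM1 binds SYM2."], "SYM2": []}

    for report in run_workflow(
        "breast cancer",
        lambda text: symbols.get(text, []),
        lambda sid: synonyms.get(sid, []),
        lambda sid: sentences.get(sid, []),
    ):
        print(report)
```

Passing the service calls in as functions keeps the chaining independent of the API model, so the same loop could sit on top of CGI-XML, SOAP or BioMOBY clients.
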
Projects where iHOP web services have been (or are being) used
| Project | Description | URL/Reference |
|---|---|---|
| ORIEL | The CGI-XML API was developed in the context of this project, and it was integrated with other biological information resources. | |
| DIAMONDS | The iHOP SOAP interface was developed and applied to the dynamic extraction of proteins related to the cell cycle in various genomes, with special emphasis on Arabidopsis proteins. This information was used as input for the modelling approaches developed by the partners of this project. | |
| ECID, COMBIO, ENFIN (Enabling Systems Biology) Network of Excellence, CARGO | These projects and their tools (e.g. ENFIN Spindle proteins tool, CARGO framework) are currently using these iHOP web service APIs in different biological and technical contexts. | |
| INB | BioMOBY iHOP web services were funded by the Spanish National Bioinformatics Institute (INB), published on the INB-specific MOBY repository, and integrated in the INB curated bioinformatics object ontology. The services will also be made available in the central MOBY repository of Canada. | |
Figure 3. A snapshot of the CARGO framework showing information about P53. The iHOP widget shows sentences with evidence of relationships between P53 and other genes.