
WIWS: a protein structure bioinformatics Web service collection.

M L Hekkelman, T A H Te Beek, S R Pettifer, D Thorne, T K Attwood, G Vriend.

Abstract

The WHAT IF molecular-modelling and drug design program is widely distributed in the world of protein structure bioinformatics. Although originally designed as an interactive application, its highly modular design and inbuilt control language have recently enabled its deployment as a collection of programmatically accessible web services. We report here a collection of WHAT IF-based protein structure bioinformatics web services: these relate to structure quality, the use of symmetry in crystal structures, structure correction and optimization, adding hydrogens and optimizing hydrogen bonds, and a series of geometric calculations. The freely accessible web services are based on the industry standard WS-I profile and the EMBRACE technical guidelines, and are available via both REST and SOAP paradigms. The web services run on a dedicated computational cluster; their function and availability are monitored daily.

Year:  2010        PMID: 20501602      PMCID: PMC2896166          DOI: 10.1093/nar/gkq453

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Macromolecular structures are at the basis of much research in fields as diverse as drug design, bio-fuel engineering, molecular biology or force field design. The number of new protein structures deposited in the PDB has remained stable for several years at around 20 per day. The statistics on web servers listed in the special volumes of NAR collected by Brazas et al. (1) clearly show that protein structures remain an important topic, as the number of web servers and visualization tools relating to protein structures continues to grow steadily. WHAT IF (2) is a widely distributed interactive software package for macromolecular analysis, visualization, modelling and structure validation. WHAT IF was originally an interactive application driven either via a Graphical User Interface or from the command line; a decade ago we exposed its first three components via human-accessible web interfaces (3). Since then, the collection of functions accessible in this way has grown to around 80, which are being used by a few hundred users each day. More recently, the software has been successfully integrated with the COOT crystallographic software (4) and the YASARA (http://www.yasara.org/) molecular modelling, simulation and drug design software (5). Encouraged by initiatives such as BioSapiens (http://www.biosapiens.info/) and EMBRACE (http://www.embracegrid.info/), which have championed the use of web services in the life sciences, we have now developed and deployed a freely available web service version of WHAT IF. Unlike the more traditional web server (e.g. HTML/CGI) interfaces that have become a popular way of providing remote human access to tools and resources, web services are straightforwardly accessible from within computer programs, allowing them to be used for large-scale ‘batch processing’ tasks that would be inconceivable via manual cut-and-paste alternatives.
Acting as a subroutine or function that resides on a remote computer, web services provide a number of benefits to bioinformatics software developers (and thence directly or indirectly also to life scientist users of their tools): reduction in the amount of code that is being re-invented and maintained; reduction in the cost of installing, managing and maintaining local installations of software packages; access to the latest version of algorithms and data without the need to upgrade local installations; and access to enormous amounts of (free) CPU time. A more extensive description of the advantages and disadvantages of the use of web services can be found in the article describing the EMBRACE web service registry elsewhere in this volume. The first WHAT IF-based web servers were launched in 1997, were published in 1998 (3) and remain active to this day (http://swift.cmbi.ru.nl/).

METHODS

WHAT IF, a FORTRAN program conceived in 1987 (2), is still actively maintained and developed. Its core of around a million lines of code represents a substantial collection of algorithms and heuristics that remain of significant utility to today’s scientific community. In order to deploy the WHAT IF code as a web service, we designed a special ‘wrapper’ application that acts as an intermediary between the legacy architecture of the command-line-based WHAT IF system and the modern protocols required by web services. This application, called WIWS (‘WHAT IF web services’), listens on the network for incoming web service requests and translates these into WHAT IF commands that are injected into the interactive component of the system as though they had been typed by a user. WHAT IF then executes these commands and produces output in its own bespoke format, which in turn is parsed by WIWS and translated back to the contemporary protocols (e.g. SOAP and XML) expected by web service software. The WSDL (‘Web Service Description Language’, a formal description of the web service that client software requires in order to access the service) and the documentation for the web services are generated automatically by WIWS and are available at http://wiws.cmbi.ru.nl/wsdl/ and http://wiws.cmbi.ru.nl/help/, respectively. Experienced WHAT IF users can freely obtain from us all software elements needed to set up their own WHAT IF-based web services. WHAT IF has over 1600 different functions that can be invoked using a command line interface, and which in essence form a simple state-based programming language. Rather than expose all of this complex fine-grained level of control as web services, we have instead developed a scripting language that allows more meaningful coarse-grained web services to be automatically generated by combining WHAT IF’s low-level commands in a coherent manner.
The architecture of the system (Figure 1) thus consists of two components: (i) the original WHAT IF core, acting as an engine responsible for executing the underlying algorithms; (ii) the WIWS system: this contains (a) an interpreter and scripts containing subroutines that map between sequences of low-level WHAT IF commands and meaningful web service-level operations, and (b) a bespoke HTTP server that responds to incoming web service requests and invokes the high-level functions represented by individual subroutines.
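The division of labour described above can be illustrated with a toy Python sketch. It is not the WIWS implementation (which is written in C++ and driven by its own script language): the engine class, the LOAD and COUNT commands, the 'key = value' output format and the CountResidues operation are all hypothetical and chosen only to show how one coarse-grained operation might be mapped onto a sequence of low-level commands whose output is then parsed.

```python
"""Toy sketch: a wrapper mapping coarse-grained operations onto sequences
of low-level commands. All commands and output formats are hypothetical."""


class ToyEngine:
    """Stand-in for an interactive, state-based program (NOT WHAT IF itself)."""

    def __init__(self):
        self.loaded = None  # state kept between commands

    def execute(self, command):
        """Run one typed command and return the engine's text output."""
        verb, _, arg = command.partition(" ")
        if verb == "LOAD":
            self.loaded = arg
            return "OK"
        if verb == "COUNT":
            if self.loaded is None:
                return "ERROR no structure loaded"
            # Bespoke 'key = value' output that the wrapper has to parse.
            return "residues = %d" % len(self.loaded)
        return "ERROR unknown command"


# Each service-level operation is a scripted sequence of low-level commands.
OPERATIONS = {
    "CountResidues": ["LOAD {structure}", "COUNT"],
}


def call_operation(engine, name, **params):
    """Translate a request into commands, run them, parse the final output."""
    if name not in OPERATIONS:
        raise KeyError("unknown operation: " + name)
    output = ""
    for template in OPERATIONS[name]:
        output = engine.execute(template.format(**params))
        if output.startswith("ERROR"):
            raise RuntimeError(output)
    key, _, value = output.partition(" = ")
    return {key: int(value)}
```

Calling `call_operation(ToyEngine(), "CountResidues", structure="ACDEFG")` issues both commands in order and returns a structured result rather than the engine's raw text, which is the role the text assigns to the wrapper layer.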
Figure 1.

Architecture of the WIWS system. The original WHAT IF system at the far left is augmented with a software layer that receives streams of automated commands, and sends them to WHAT IF as though they had been typed at the command line. The WIWS server (centre) receives Web service invocations from users in either REST or SOAP format, and matches these with one of the scripted subroutines, which then calls WHAT IF in order to execute the appropriate sequence of instructions.

When the server is initialized, it parses the latest version of its script and automatically generates and advertises the WSDL descriptions of the subroutines described therein, as well as creating corresponding SOAP and REST endpoints. The server is written in C++. The script language is based on the Oberon-2 programming language and implements most of its syntax. The XML and HTTP handling code comes from the open source libzeep project (http://libzeep.berlios.de/).
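The idea of deriving service descriptions and endpoints from the scripted subroutines themselves can be sketched by analogy in Python, using function signatures in place of the WIWS script language. The sketch below is only an analogy under stated assumptions: the two functions are local stand-ins that do not call WHAT IF, the dictionary it produces is not WSDL, and the "/rest/" path prefix is a hypothetical choice rather than the actual WIWS URL layout.

```python
import inspect


def show_bumps(pdb_id: str) -> list:
    """Hypothetical local stand-in for a scripted subroutine."""
    return []


def residue_torsions(pdb_id: str) -> list:
    """Another hypothetical local stand-in."""
    return []


def describe(functions):
    """Build a machine-readable description of each operation from its signature."""
    description = {}
    for func in functions:
        sig = inspect.signature(func)
        description[func.__name__] = {
            "doc": inspect.getdoc(func),
            "inputs": {name: param.annotation.__name__
                       for name, param in sig.parameters.items()},
            "output": sig.return_annotation.__name__,
        }
    return description


def build_endpoints(functions, prefix="/rest/"):
    """Map one URL path per operation onto the function that serves it."""
    return {prefix + func.__name__: func for func in functions}
```

Because both the description and the endpoint table are computed from the same list of functions, adding a new function is enough to publish it, which mirrors the point made in the text that new services follow automatically from the script.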

RESULTS

To date, we have produced 84 web servers and 64 web services, all of which take PDB format molecular structure data as input. The web servers mainly deal with options that produce two- and three-dimensional graphics and other WHAT IF options that produce human readable output. The web services focus on WHAT IF options that would be useful for third-party protein structure-related software tools. The web service operations broadly fall into one of two groups. In the first group are operations that categorize or otherwise annotate residues or atoms, for example to: provide simple geometric parameters, such as accessibilities, torsion angles, etc.; list observables, such as disulphide bridges, salt bridges, etc.; and provide tables of per-residue structure validation scores. In the second group are operations that generate, or otherwise modify, the input’s molecular structure, for example to: correct and/or complete coordinates; add and optimize protons; and add symmetry-related residues/molecules. A representative selection of these services is shown in Table 1.
Table 1.

Selected WHAT IF web services (of the 64 that are available to date)

    Web service                 What the web service returns
Visualization options
    GetSurfaceDot               (Many) points in space that visualize the surface.
Geometric calculations
    AtomAccessibilitySolvent    Solvent-accessible surface area for each atom in Å².
    ResidueAccessibilityVacuum  Solvent-accessible surface area in Å², for the residue in vacuum, only including the backbone of its direct neighbours.
    ResidueTorsions             For each residue, its torsion angles: φ, ψ, ω, χ1–χ5 (in degrees).
    ShowTauAngle                For each amino acid, its backbone τ-angle (in degrees).
    CysteineTorsions            Cys-Cys bridge torsions over: Cα(i)-Cβ(i)-Sγ(i)-Sγ(j)-Cβ(j)-Cα(j).
Atomic contacts
    ShowBumps                   Atom pairs with a clash worse than 0.25 Å.
    ShowSaltBridges             Pairs of charged groups within 8.0 Å.
    ShowHydrogenBonds           H-bonded atoms: four geometric parameters describe the bond.
    ShowLigandContacts          Lists macromolecule-ligand contacting atom pairs.
    ShowCysteineMetal           Lists cysteines that are bound to a metal.
    HasNucleicContacts          Lists protein residues that contact a nucleic acid.
Structure quality
    PackingQuality              Packing normality for amino acids.
    ImproperQualityMax          Per residue, its worst improper dihedral Z-score.
Protein engineering
    ShowLikelyRotamers          Show rotamer likelihoods for rotamers for all amino acid types.
    ProlineMutationValue        Likelihood at each position that an introduced proline will be thermostabilizing.
Returning coordinates
    PDBasXML                    The input PDB file is returned in XML.
    CorrectedPDBasXML           The input PDB gets 22 correction steps and is returned as XML.
    SymShellFiveXML             Shell of symmetry-related residues, 5.0 Å thick, as XML.

The full list and all descriptions are available at http://wiws.cmbi.ru.nl/help/. The algorithms underlying the web services are explained at the WHAT IF web site (http://swift.cmbi.ru.nl/whatif/).

All the WHAT IF web services are regularly tested by a script that exercises them against 10 valid PDB files that have been selected for their complexity (from a WHAT IF point of view). The services have also been entered into the EMBRACE registry (6), which regularly monitors and reports on their availability. Usage of the web services is free and essentially unlimited, but we request that users submit their web service calls sequentially. As a case study, 10 of the WHAT IF web service operations have been integrated with the Utopia (7,8) visualization suite. Utopia has a flexible plug-in system that allows its graphical visualization components to be linked easily with online tools and databases. Relevant fragments of the Python code necessary to use the WHAT IF web services in Utopia are listed at the WHAT IF (http://swift.cmbi.ru.nl/whatif/) web service documentation page (http://swift.cmbi.ru.nl/Webservices/). Utopia is freely available from http://www.getutopia.com/. Figures 2 and 3 show Utopia screenshots that illustrate the utility of the web service-based virtual linkup of the two software packages.
The integrated perspectives shown in the Figures provide at-a-glance overviews of biophysical characteristics both of the molecular structure and of individual amino acids, and how these interrelate: this facilitates, for example, analysis of crystal-packing contact residues relative to core secondary structures (e.g. contact residues 26–36 in Figure 2 lie on the long projecting finger-like loop in the centre of the 3D structure); identification of the locations and relative strength of salt-bridging residues, again in the context of the secondary structures in which they are located; and so on (Figures 3 and 4).
Figure 2.

Left, a representation of PDB file 3EBX (9,10). The all-atom dot surface is calculated by WHAT IF, and is colour coded by atom type. The backbone ribbon, generated by Utopia, is coloured according to residue physicochemical characteristics: green, polar neutral; red, polar acidic; blue, polar basic; white, hydrophobic aliphatic; purple, hydrophobic aromatic; brown, special structural (proline, glycine); yellow, sulphur-containing, structural (cysteine).

Figure 3.

Here, the horizontal tracks show in order (i) the amino acid residue number in the sequence, (ii) the sequence itself (colour code as above), (iii) residues with crystal-packing contacts (indicated by red triangles); (iv) the secondary structure of 3EBX determined by DSSP (11), filtered by WHAT IF and displayed by Utopia—here, red arrows denote β-strands; (v) vacuum accessibility in square Ångströms.

Figure 4.

PDB file 1A08 (12) displayed using Utopia, having invoked 4 WHAT IF services. Tracks show in order: (i) amino acid residue number, (ii) the residue sequence (colour code as in Figure 3); (iii) residues involved in salt bridges (height of the bar indicates the total enthalpy contribution of all salt bridges in which the residue is involved); (iv) secondary structure (blue zigzags indicate α-helices, red arrows denote β-strands); (v) vacuum accessibility displayed using an alternative mode of Utopia visualization relative to that in Figure 3 (the values at 25, 28, 40 and 52 are missing because their side chains are not complete in the PDB file); (vi) red triangles indicate residues involved in crystal-packing contacts.

The use of the WHAT IF web services is rather straightforward. The WSDL will suffice for most experienced programmers, so we will list here only one small example, mainly to illustrate the ease of use of the WHAT IF web services. The smallest Python program we could think of that actually does something is listed below. The first line in this four-line program points at the Python executable. The second line imports the suds Python module (which you can download from http://python-suds.sourceforge.net/). The third line tells the program where the WHAT IF web services WSDL is located and the fourth line tells these services to print the results of the ShowBumps web service for the PDB file with identifier 1crn.

    #!/usr/bin/python
    from suds import client
    c = client.Client('http://wiws.cmbi.ru.nl/wsdl')
    print(c.service.ShowBumps('1crn'))
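For batch use, the request that calls be submitted sequentially can be honoured with a small helper around the same suds client. The following is a minimal sketch rather than a recommended client: the function names run_sequentially and show_bumps_via_suds are hypothetical, and the second function assumes that suds is installed and that the WSDL location quoted in the text is reachable.

```python
def run_sequentially(call, pdb_ids):
    """Call `call(pdb_id)` for one identifier at a time, never in parallel.

    Returns (results, errors) so that one failing entry does not stop the batch.
    """
    results, errors = {}, {}
    for pdb_id in pdb_ids:
        try:
            # The next call starts only after this one has returned.
            results[pdb_id] = call(pdb_id)
        except Exception as exc:  # e.g. a remote fault or a network error
            errors[pdb_id] = str(exc)
    return results, errors


def show_bumps_via_suds(wsdl_url='http://wiws.cmbi.ru.nl/wsdl'):
    """Return the ShowBumps operation; assumes suds and a reachable service."""
    from suds import client  # imported lazily so the helper works without suds
    return client.Client(wsdl_url).service.ShowBumps
```

A batch over several entries would then read `results, errors = run_sequentially(show_bumps_via_suds(), ['1crn', '3ebx'])`. Passing the operation in as a plain callable keeps the loop independent of the SOAP library, so it can be exercised offline with a stub.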

DISCUSSION

We have produced 64 web services that perform a wide variety of macromolecular structure-related tasks. While selecting and designing these web services, we especially kept in mind the many programmers who provide macromolecular software and/or web services for niche research areas and who thus could benefit most from easy access to some of WHAT IF’s more advanced features, such as protein structure validation, structure correction and hydrogen bond- and symmetry application-related options. The automatic generation of web services via our scripting language means that deploying new services is a relatively trivial task, and we welcome and encourage requests from the community for new services.

FUNDING

NBIC, EMBRACE and UK BBSRC (grant number BBE0160651); EMBRACE project is funded by the European Commission within its FP6 Programme, under the thematic area ‘Life sciences, genomics and biotechnology for health’ (contract number LHSG-CT-2004-512092). Funding for open access charge: CMBI departmental money. Conflict of interest statement. None declared.
  12 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Increasing the precision of comparative models with YASARA NOVA--a self-parameterizing force field.

Authors:  Elmar Krieger; Günther Koraimann; Gert Vriend
Journal:  Proteins       Date:  2002-05-15

3.  WHAT IF: a molecular modeling and drug design program.

Authors:  G Vriend
Journal:  J Mol Graph       Date:  1990-03

4.  Homology modeling, model and software evaluation: three related resources.

Authors:  R Rodriguez; G Chinea; N Lopez; T Pons; G Vriend
Journal:  Bioinformatics       Date:  1998       Impact factor: 6.937

5.  Peptide ligands of pp60(c-src) SH2 domains: a thermodynamic and structural study.

Authors:  P S Charifson; L M Shewchuk; W Rocque; C W Hummel; S R Jordan; C Mohr; G J Pacofsky; M R Peel; M Rodriguez; D D Sternbach; T G Consler
Journal:  Biochemistry       Date:  1997-05-27       Impact factor: 3.162

6.  Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features.

Authors:  W Kabsch; C Sander
Journal:  Biopolymers       Date:  1983-12       Impact factor: 2.505

7.  Evolution in bioinformatic resources: 2009 update on the Bioinformatics Links Directory.

Authors:  Michelle D Brazas; Joseph Tadashi Yamada; B F Francis Ouellette
Journal:  Nucleic Acids Res       Date:  2009-06-15       Impact factor: 16.971

8.  An active registry for bioinformatics web services.

Authors:  S Pettifer; D Thorne; P McDermott; T Attwood; J Baran; J C Bryne; T Hupponen; D Mowbray; G Vriend
Journal:  Bioinformatics       Date:  2009-05-21       Impact factor: 6.937

9.  UTOPIA-User-Friendly Tools for Operating Informatics Applications.

Authors:  S R Pettifer; J R Sinnott; T K Attwood
Journal:  Comp Funct Genomics       Date:  2004

10.  Visualising biological data: a semantic approach to tool and database integration.

Authors:  Steve Pettifer; David Thorne; Philip McDermott; James Marsh; Alice Villéger; Douglas B Kell; Teresa K Attwood
Journal:  BMC Bioinformatics       Date:  2009-06-16       Impact factor: 3.169
