Damiano Piovesan1, Giovanni Minervini1, Silvio C E Tosatto2. 1. Department of Biomedical Sciences, University of Padua, Padua 35121, Italy. 2. Department of Biomedical Sciences, University of Padua, Padua 35121, Italy; CNR Institute of Neuroscience, Padua 35121, Italy. silvio.tosatto@unipd.it.
Abstract
Residue interaction networks (RINs) are an alternative way of representing protein structures where nodes are residues and arcs physico-chemical interactions. RINs have been extensively and successfully used for analysing mutation effects, protein folding, domain-domain communication and catalytic activity. Here we present RING 2.0, a new version of the RING software for the identification of covalent and non-covalent bonds in protein structures, including π-π stacking and π-cation interactions. RING 2.0 is extremely fast and generates both intra and inter-chain interactions including solvent and ligand atoms. The generated networks are very accurate and reliable thanks to a complex empirical re-parameterization of distance thresholds performed on the entire Protein Data Bank. By default, RING output is generated with optimal parameters but the web server provides an exhaustive interface to customize the calculation. The network can be visualized directly in the browser or in Cytoscape. Alternatively, the RING-Viz script for Pymol allows visualizing the interactions at atomic level in the structure. The web server and RING-Viz, together with an extensive help and tutorial, are available from URL: http://protein.bio.unipd.it/ring.
Non-covalent interactions in proteins have a wide range of different energies and lengths, making them inherently difficult to characterize (1). While the energy contribution of a single interaction is almost negligible, together they determine the three-dimensional protein structure (2). Describing amino acid properties through continuous functions, although highly informative, requires complex calculations and non-trivial analysis. Some effort has been made to extract valuable information through simplification by applying network theory to protein structures (3–5). Residue interaction networks (RINs) consider single amino acids as nodes and physico-chemical interactions, like covalent and non-covalent bonds, as edges. Representing protein structures as RINs has become common practice to explore the complexity inherent in macromolecular systems (6,7). As a consequence, structure analysis has been simplified, allowing researchers to focus only on a subset of relevant residues. According to the concept of ‘residue centrality’ (8), evolutionarily conserved (central) residues can be identified by just looking at hyper-connected nodes. RINs have been extensively and successfully used for analysing functional features linked to a broad range of biological processes, the effect of mutations, protein folding, intra-protein domain–domain communication and catalytic activity (9–16). On the other hand, software generating RINs still has limitations due to the use of simplified interaction types. RINalyzer (17), for example, calculates only hydrogen bonds, van der Waals (VDW) interactions and generic contacts based on distance. This limitation can be explained by technical reasons, such as the computational cost of measuring the distance of all possible atom pairs in a protein, in particular for large biopolymers. Another problem is defining distance and angle constraints for certain interactions (e.g. involving π-systems) in large molecules like proteins.
For example, the Protein Interaction Calculator (PIC) (18) calculates all types of interactions but lacks atomic resolution for most of them, and its distance thresholds are simply based on values reported in the literature, which may often be obsolete. The PIC output is provided as separate lists with different formats and is hence difficult to import into external network viewers such as Cytoscape (19). The Residue Interaction Network Generator (RING) was presented to address these limitations (20). RING 2.0 is a new and completely rewritten version of the software based on the Victor library (21). Compared with the previous version, RING 2.0 is available as a stand-alone package without the need for third-party software, e.g. for the calculation of hydrogen bonds, VDW interactions and secondary structure, increasing the calculation speed by an order of magnitude. RING 2.0 is now also able to return both intra- and inter-chain interactions as well as contacts involving ‘hetero atoms’ (i.e. ligands, DNA/RNA, cofactors, metal ions and solvent molecules). Moreover, distance thresholds have been optimized to maximize network reliability on a large scale by analysing the entire Protein Data Bank (PDB) repository (22). The web server is straightforward to use and allows the user to visualize the network directly in the browser, with node attributes shown as interchangeable layers. RING output is compatible with the RINalyzer (17) and StructureViz (23) plugins for Cytoscape (19). The network can be visualized in the structure at atomic level thanks to the RING-Viz utility for Pymol, available on the RING web server.
MATERIALS AND METHODS
RING generates an interaction network in two steps. First, it identifies a list of residue–residue (or residue–ligand and ligand–ligand) pairs eligible for interaction based on all-atom distance measurements. Contacts are then characterized by identifying specific interaction types. Interactions are sorted by position, without repetitions, with the index of the source node (first column) always lower than that of the target. Multiple interactions occurring between the same residue pair, i.e. involving different atoms, are sorted by energy and distance. In this case, the user may choose to receive all interactions, only the first (i.e. most energetic), or only one interaction for each type. Sorting is very helpful for manual inspection of the edges. Algorithm complexity is quadratic in the protein size (number of atoms). As a worst-case example, for 100 000 atoms distributed over 42 chains (PDB ID: 4V6), the entire computation (ca. 5 billion comparisons) takes 18 min (12 for finding contacts and 6 for sorting/filtering them) on a standard laptop.
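The sorting and filtering step described above can be illustrated with a minimal Python sketch. This is not the RING implementation: the `Interaction` record, the mode names (`all`, `first`, `one_per_type`) and the convention that a larger energy value means "more energetic" are assumptions made for illustration only. The helper `n_comparisons` simply reproduces the quadratic pair count quoted in the text.

```python
from collections import namedtuple

# Hypothetical record for one interaction (field names are illustrative).
Interaction = namedtuple("Interaction", "source target itype energy distance")


def normalise(inter):
    """Make sure the source index is always lower than the target index."""
    if inter.source > inter.target:
        return inter._replace(source=inter.target, target=inter.source)
    return inter


def sort_interactions(interactions):
    """Sort by position, then by decreasing energy, then by increasing distance."""
    return sorted(
        (normalise(i) for i in interactions),
        key=lambda i: (i.source, i.target, -i.energy, i.distance),
    )


def filter_interactions(interactions, mode="one_per_type"):
    """Keep all interactions, only the first per pair, or one per pair and type."""
    ordered = sort_interactions(interactions)
    if mode == "all":
        return ordered
    seen, kept = set(), []
    for i in ordered:
        pair = (i.source, i.target)
        key = pair if mode == "first" else pair + (i.itype,)
        if key not in seen:
            seen.add(key)
            kept.append(i)
    return kept


def n_comparisons(n_atoms):
    """Number of unordered atom pairs, i.e. the quadratic cost of an all-atom scan."""
    return n_atoms * (n_atoms - 1) // 2
```

For 100 000 atoms, `n_comparisons` gives 4 999 950 000 pairs, consistent with the "ca. 5 billion comparisons" reported for the worst-case example.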
Interaction type calculation
Table 1 provides an overview of the RING approach to calculate interaction types. Hydrogen bonds are calculated by applying a donor-hydrogen-acceptor (DHA) angle constraint of less than or equal to 63° (24) and defining a limited set of valid donor/acceptor atoms (25). Generally, only Carbon–Carbon and Carbon–Sulphur pairs are considered valid VDW interactions. They are evaluated using atom surfaces, i.e. subtracting the atom radius, 1.89 Å for sulphur and 1.77 Å for carbon (26), from the atomic distance. VDW is the only specific interaction type also calculated for ligands, since it is not necessary to know the ligand structure. Special VDW cases involving N and O sidechain atoms of glutamine and asparagine are also considered (27–29). Other types are evaluated based on the distance between pseudo-atoms, i.e. the barycentre of aromatic rings (π–π stacking) or the centre of mass of charged groups (ionic interaction). π–cation interactions are limited to cases where the cation projection over the interacting partner ring lies inside the ring itself.
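As an illustration of the surface-based VDW evaluation, the following Python sketch computes the distance between two atom surfaces. It assumes that the radii of both atoms are subtracted from the centre-to-centre distance, which is one reading of the description above; the function name and the choice to return `None` for unsupported pairs are hypothetical. The special glutamine/asparagine cases are not covered.

```python
import math

# Atom radii in Å as quoted in the text (26).
VDW_RADIUS = {"C": 1.77, "S": 1.89}

# Only C-C and C-S pairs are treated as valid VDW candidates in this sketch.
VALID_PAIRS = ({"C"}, {"C", "S"})


def surface_distance(elem_a, xyz_a, elem_b, xyz_b):
    """Return the surface-to-surface distance in Å, or None for other pairs.

    Negative values indicate overlapping (clashing) atoms.
    """
    if {elem_a, elem_b} not in VALID_PAIRS:
        return None
    centre_distance = math.dist(xyz_a, xyz_b)
    return centre_distance - VDW_RADIUS[elem_a] - VDW_RADIUS[elem_b]
```

Two carbon atoms whose centres are 4.0 Å apart would, under this assumption, have a surface distance of 4.0 − 1.77 − 1.77 = 0.46 Å.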
Table 1.
RING-2.0 interaction types
Network attributes
RING generates attributes for both nodes and edges. Structural features are reported for each node and include secondary structure, vertex degree (the number of directly connected nodes), experimental uncertainty for X-ray structures (i.e. Cα B-factor), conformational energy preferences determined with FRST (30) and TAP (31), conservation (Shannon entropy) and cumulative mutual information (MI) (32), calculated from PSI-BLAST profiles (33). Interaction energies have been derived from the literature; in particular, hydrogen bonds have different energies depending on the donor/acceptor pair (34). Edge attributes include the bond angle (except for VDW), energy and involved atoms. The orientation attribute is calculated only for π–π stacking and π–cation interactions and represents, respectively, the reciprocal orientation of the two interacting rings and the positioning of the guanidine group (arginine) relative to the plane of the aromatic ring partner (Figure 1). Moreover, when the sequence profile is calculated, RING provides MI and APC corrected MI (32).
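Two of the node attributes above lend themselves to a short worked example. The Python sketch below computes vertex degree from an edge list and Shannon entropy from the residues observed at one alignment position. It is a simplification: RING derives conservation from PSI-BLAST profiles, whereas this sketch uses raw residue counts, and the node labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def vertex_degree(edges):
    """Count distinct neighbours per node; parallel edges are counted once."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(linked) for node, linked in neighbours.items()}


def shannon_entropy(column):
    """Shannon entropy (bits) of the residues observed at one position."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A fully conserved position has an entropy of 0 bits, while a position split evenly between two residue types has an entropy of 1 bit.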
Figure 1.
Orientation definition in RING. π–π stacking interactions adopt parallel (P), normal (N), lateral (L), tilted edge to face (T-EF) and tilted face to edge (T-FE) orientations. Orientation in π–cation interactions is provided only for contacts involving arginine and describes guanidine plane positioning relative to the partner ring and it is limited to P, L and N conformations.
Distance threshold optimization
The RING algorithm calculates atomic interactions based on geometrical criteria, without complicated analysis based on force fields, obtaining a reliable interaction network very rapidly. The quality of the interactions strongly depends on geometrical constraints, in particular the distance threshold parameter. An exhaustive analysis has been performed on the entire PDB. The distance distribution of different interaction types has been calculated for the interaction network of all available X-ray and NMR structures (116 568; April 2016). Two different distance thresholds have been chosen for RING to represent strict and permissive parameters (see Figure 2).
Figure 2.
Distance distribution for the six different interaction types. Hydrogen bonds are split into side chain (SC) and main chain (MC). Ionic interactions are characterized by the positively charged residue. Van der Waals interaction by secondary structure of the pair (E = sheet, H = helix, x = undefined). π–π stacking and π–cation interactions are separated by orientation type (see ‘Materials and Methods’ section). Red and blue vertical lines correspond respectively to the ‘strict’ and ‘relaxed’ thresholds in the web server.
Van der Waals
Most VDW interactions involve C-C pairs with a distance of [0.71, 0.74] Å. C-S pairs are mainly found in the [0.19, 0.22] Å interval (data not shown). The number of contacts under the ideal threshold of 0.5 Å is ∼74 million, rising to 119 million under a more relaxed cutoff of 0.8 Å. Interestingly, many atom pairs (ca. 11 million) clash and are shown as negative values in the figure. These come from low quality structures. Two secondary peaks at [0.74, 0.78] and [1.20, 1.23] Å are also of interest: they correspond to false positive VDW interactions found inside α-helices and between close strands in β-sheets, respectively.
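The two VDW cutoffs quoted above can be applied as in the following Python sketch. Several points are assumptions rather than documented RING behaviour: that the "ideal" 0.5 Å value corresponds to the strict setting, that the bounds are inclusive, and that negative surface distances are reported separately as clashes. The category labels are illustrative.

```python
STRICT_VDW = 0.5   # Å, "ideal" threshold quoted in the text
RELAXED_VDW = 0.8  # Å, relaxed cutoff quoted in the text


def classify_vdw(surface_distance):
    """Label a surface-to-surface distance (Å) against the two cutoffs."""
    if surface_distance < 0:
        return "clash"      # overlapping atoms, shown as negative values
    if surface_distance <= STRICT_VDW:
        return "strict"     # accepted under both settings
    if surface_distance <= RELAXED_VDW:
        return "relaxed"    # accepted only under the relaxed setting
    return "none"           # beyond both cutoffs
```

Under these assumptions, a contact at 0.72 Å (inside the main C-C peak) is accepted only with the relaxed setting, while one at 1.21 Å (the β-sheet secondary peak) is rejected by both.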
Hydrogen bonds
The hydrogen bond distribution has been split into main chain (MC) and side chain (SC) interactions. The peak at [2.84, 2.87] Å corresponds to interactions that stabilize the packing of different secondary structure elements, i.e. bridges between α-helices or turns. A similar peak for the MC bonds at [2.94, 2.98] Å corresponds to interactions between adjacent strands in β-sheets, whereas the second MC peak at [5.01, 5.04] Å identifies bonds in α-helices separated by a turn (four residues). Above 5.6 Å only spurious interactions are identified.
Salt bridges
Salt bridges (ionic interactions) are defined between a positively charged amino acid (Arg, Lys, His) and a negatively charged residue (Asp or Glu). Both arginine and lysine have a weak preference for interacting with glutamic acid, whereas histidine slightly prefers aspartic acid (data not shown). The main peak for arginine at [3.63, 3.66] Å represents residues interacting with a planar or orthogonal orientation relative to the ring plane. The second peak at [4.38, 4.41] Å corresponds either to gauche conformations or situations where the interaction is altered by neighbouring forces. The lysine peak at [3.75, 3.78] Å corresponds to interactions occurring when the lysine Cϵ-Nζ axis lies on the plane of the carboxylic group of the interaction partner. Above 4 Å this orientation is lost and in most cases false interactions are predicted. Histidine has a peak in the range [4.59, 4.62] Å. Above this distance interactions are spurious.
π–π stacking
π–π stacking interactions involve aromatic side chain rings. According to the reciprocal orientation (Figure 1), it is possible to identify four different categories with different distributions. The orthogonal (N, normal) conformation is found at [5.36, 5.4] Å. Both the parallel (P) and tilted edge-to-face (T-EF) conformations have a maximum at [5.94, 5.99] Å, whereas the L conformation (resembling the letter ‘L’) has a peak at [6.03, 6.08] Å. In general, beyond 7 Å, a π–π interaction is unreliable, since the straight line connecting the rings either passes through other atoms or the side chains point in opposite directions.
π–cation
π–cation interactions involving arginine are characterized by different orientations of the charged group relative to the ring plane of the partner (Figure 1). Parallel, lateral and normal orientations have a peak at [3.64, 3.68] Å, [4.28, 4.32] Å and [4.56, 4.60] Å respectively. The orthogonal (N) conformation presents another relevant peak around [6.12, 6.16] Å, corresponding to a situation where the charged group is found opposite to the interacting ring. Interactions involving lysine are more difficult to interpret, as they have a peak around [4.40, 4.44] Å and above ca. 5 Å only spurious interactions are found.
Visualizing RINs in PDB structures
RINs generated by RING can be visualized directly in the structure thanks to the RING-Viz script for Pymol. The script is invoked from the command line, taking the RING network and the corresponding PDB structure file as input. RING-Viz works out of the box on both Windows and Linux systems, requiring only Pymol as a dependency. The script also accepts other parameters to customize edge rendering or to filter interactions by type, distance, orientation and node identifier. Once the script finishes loading, nodes and edges appear as new objects corresponding to different interaction types. In this way, the node and edge view can be customized in the same way as normal atom selections.
Server implementation
The RING web server is implemented in Node.js (https://nodejs.org) using the REST (Representational State Transfer) architecture and can be accessed through the web interface or programmatically through the REST functionality. The web interface is built using the Angular.js (https://angularjs.org) framework and Bootstrap CSS style (http://getbootstrap.com). The network layout in the result page is calculated on the client side using a force-directed algorithm provided by the D3.js library (http://d3js.org). Unlike the previous version, RING 2.0 is now also available as a stand-alone package. Hydrogen bonds and VDW interactions as well as secondary structure are calculated without the need for external tools. RING 2.0 is now also able to return both intra- and inter-chain interactions and contacts involving hetero atoms. Moreover, newly optimized distance thresholds are available as built-in defaults in strict and relaxed versions. The RING output network is provided both in text and GraphML (XML) formats, improving compatibility and integration with external tools.
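Because GraphML is plain XML, a network in that format can be read with the Python standard library alone. The sketch below parses a small inline GraphML document. The node identifiers, the `e_type` key and the interaction labels in the sample are invented for illustration and are not taken from actual RING output; only the GraphML namespace and element names follow the GraphML format itself.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hypothetical sample; identifiers and attribute keys are assumptions.
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="e_type" for="edge" attr.name="interaction" attr.type="string"/>
  <graph id="G" edgedefault="undirected">
    <node id="A:10:TYR"/>
    <node id="A:25:PHE"/>
    <node id="A:40:ARG"/>
    <edge source="A:10:TYR" target="A:25:PHE">
      <data key="e_type">PIPISTACK</data>
    </edge>
    <edge source="A:10:TYR" target="A:40:ARG">
      <data key="e_type">PICATION</data>
    </edge>
  </graph>
</graphml>"""


def read_graphml(text):
    """Return (node ids, edges) where each edge is (source, target, attributes)."""
    root = ET.fromstring(text)
    nodes = [n.get("id") for n in root.findall("g:graph/g:node", NS)]
    edges = []
    for e in root.findall("g:graph/g:edge", NS):
        attrs = {d.get("key"): d.text for d in e.findall("g:data", NS)}
        edges.append((e.get("source"), e.get("target"), attrs))
    return nodes, edges
```

Calling `read_graphml(SAMPLE)` yields three node identifiers and two edges, each carrying its `data` values as a dictionary, which is usually enough to feed the network into other analysis code.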
SERVER DESCRIPTION
Input
The RING web interface is straightforward to use. The main page features an input box, which accepts either a PDB identifier or a structure file. By default, RING processes the first chain. Alternatively, the user can select the chain manually or decide to perform the calculation on all chains, obtaining both intra- and inter-chain connections. RING compares residues that are not adjacent (i.e. separated) in the sequence. By default, the sequence separation is set to 3, i.e. it compares positions i and i + 3, but the user can vary the threshold to further filter local interactions. Two important options are related to the edge cardinality and distance thresholds. RING can return one, multiple or all possible interactions between a node pair. By default, it provides multiple interactions, but only one for each type. Distance thresholds are set automatically, but the user can choose between a stringent and a relaxed definition, providing an easy way to generate either very reliable or more inclusive networks. The two sets have been defined through large scale analysis, as described in the Methods section. Mutual information and residue conservation (entropy) are calculated on demand, since they require a time-consuming PSI-BLAST profile. However, the server is designed to remain responsive: the output network is generated immediately and missing attributes are added transparently when the calculation finishes.
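The sequence separation filter can be expressed in a few lines of Python. This sketch assumes that a pair is kept when the positions differ by at least the chosen separation (so i and i + 3 pass with the default of 3) and that the filter applies only within a chain; the second point is an assumption, and the function name is hypothetical.

```python
def passes_sequence_separation(chain_a, pos_a, chain_b, pos_b, min_sep=3):
    """Return True if the residue pair is far enough apart in the sequence."""
    if chain_a != chain_b:
        # Assumption: residues on different chains are never sequence-adjacent.
        return True
    return abs(pos_a - pos_b) >= min_sep
```

Raising `min_sep` removes more local contacts, for example the i to i + 4 pairs typical of α-helices when it is set to 5.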
Output
RING provides the network as an interactive graph on the results page (see Figure 3). Node positions are updated dynamically thanks to a force-directed algorithm that tries to optimize the layout. The layout can also be adjusted manually by modifying the force parameters or dragging nodes with the mouse. Nodes can be coloured to highlight different aspects, like residue chemical propensity, vertex degree, secondary structure, mutual information and conservation (when available). Additional details are shown in a tooltip when the mouse hovers over a node or edge element. Multiple connections between nodes are shown as curved lines and ‘hetero’ molecules as grey circles with a black outline. RING output is also provided as different files, including the network in both GraphML (XML) and text format, the processed PDB structure with hydrogen atoms and the vector image (SVG) of the graph. The network can be loaded in Cytoscape (http://www.cytoscape.org) and visualized in the structure by running the RING-Viz program (see ‘Materials and Methods’ section), which is able to draw atomic level connections in Pymol (https://www.pymol.org). The XML network file can also be used by the RINalyzer/StructureViz (35) plugin to synchronize residue selection in Cytoscape with the 3D visualization in Chimera (23). Detailed instructions and examples are available in the tutorial, and information about output formats is given in the help section of the website.
Figure 3.
RING result page for the human p27Kip kinase inhibitory domain bound to the phosphorylated cyclinA-Cdk2 complex (PDB code: 1JSU). The top-left graph shows the RIN with nodes and edges coloured according to the legend in the top right part. Highlighted interactions are shown in the lateral panels (structures in cartoon representation, interacting residues as sticks) and have been generated using the RING-Viz script (see ‘Materials and Methods’ section). The three inserts represent the same network graph with different colouring schemes. Clockwise from the top-right corner, the highlighted node attributes are: mutual information, conservation and node degree.
Usage example
Cyclin-dependent kinase (CDK) inhibitors play a central role in the regulation of eukaryotic cell cycles (16,36). A common event during malignant cancer progression is the deregulation of cell cycle phase transitions due to mutations frequently inactivating the kinase inhibitor activity. p27Kip, a member of the Kip/Cip protein family, is known to act as a tumour suppressor protein (37). It is an intrinsically disordered protein (38) lacking a hydrophobic core, characterized by consecutive secondary structure elements not interacting with each other. This extended conformation is used to form a relatively large contacting surface allowing multiple types of interactions with its binding partners (39). Here, we used RING 2.0 to analyse the crystal structure of p27Kip bound to the cyclinA-Cdk2 complex (PDB code: 1JSU) and show how the web server can be used to easily retrieve information from a crystal structure. The generated RIN, covering three different chains, has a total of 510 nodes and 722 edges (see Figure 3). In the first panel (top-left corner), nodes are coloured by chain and both π–π and ionic interaction lines are thicker. Visual inspection revealed both intra- and inter-chain clusters of π–π interactions. One cluster (blue circle in the figure) represents the residues connecting the p27Kip-Cdk2 chains. The inhibitor uses a β-strand to clamp around a β-sheet of Cdk2. This specific interaction induces a structural change which contributes to kinase inactivation. A similar cluster representation was also generated for the p27Kip LGF binding motif. It lies in a shallow groove of cyclinA formed by the α1, α3 and α4 helices of the cyclin-box repeat (red circle in the figure). RING 2.0 highlights important interactions at a glance, allowing fast recognition of functional residues. The three inserts in Figure 3 highlight other graph representations.
Mutual information, conservation (entropy) and node degree are shown clockwise from the top-right corner. In general, node degree correlates with conservation and both provide indications on key residues. Mutual information is calculated only for intra-chain connections and highlights residues relevant for structural stability. It is interesting to note that chain C (green nodes) lacks residues with high mutual information values (pale-blue nodes in the top-right panel). This is not surprising as chain C lacks a hydrophobic core. Intra-chain contacts in elongated, disordered structures are less important and therefore less sensitive to correlated mutations.
CONCLUSIONS
We have presented RING 2.0, a new version of the RING software, for identification of both covalent and non-covalent bonds in protein structures. RING 2.0 is extremely fast and generates both intra- and inter-chain interactions while also considering ‘hetero atoms’, i.e. solvent, ligand and DNA/RNA atoms. A new empirical re-parameterization of distance thresholds was performed on the entire PDB repository, ensuring a more reliable detection of real interactions. By default, RING output is generated with optimal parameters, but the web server provides an exhaustive interface to customize calculations. The network can be visualized directly on the web server or in Cytoscape. Alternatively, the RING-Viz script for Pymol allows visualizing atomic level interactions in the structure.