HotRegion: a database of predicted hot spot clusters.

Engin Cukuroglu, Attila Gursoy, Ozlem Keskin.

Abstract

Hot spots are energetically important residues at protein interfaces and they are not randomly distributed across the interface but rather clustered. These clustered hot spots form hot regions. Hot regions are important for the stability of protein complexes, as well as providing specificity to binding sites. We propose a database called HotRegion, which provides the hot region information of the interfaces by using predicted hot spot residues, and structural properties of these interface residues such as pair potentials of interface residues, accessible surface area (ASA) and relative ASA values of interface residues of both monomer and complex forms of proteins. Also, the 3D visualization of the interface and interactions among hot spot residues are provided. HotRegion is accessible at http://prism.ccbb.ku.edu.tr/hotregion.

Year: 2011 PMID: 22080558 PMCID: PMC3245113 DOI: 10.1093/nar/gkr929

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Proteins interact with other proteins through their interfaces in order to fulfill their functions. Interfaces are formed by residues whose properties determine binding specificity and affinity. Correct orientations of the residues are critical for complex formation. Residue–residue interactions are denser within binding sites than across the rest of the protein surface, which suggests that protein–protein interactions depend strongly on the cooperativity of the residues (1). Some proteins interact with only one or two partners. Others, called hub proteins, may interact with tens of other proteins. It is physically impossible for a hub protein to interact with all of its partners at the same time, since its surface area is fixed. This suggests that some binding sites are used repeatedly to bind different proteins (2–4), probably each with different affinity and specificity. The distribution of the residues across the interface and the residue–residue interactions may answer the question ‘How can the interfaces recognize their partners?’. The residues tend to behave cooperatively during the interactions and they form modules in the interface (5). Proteins may utilize these modules to achieve specificity and affinity during interactions (6–9), and combinations of these modules yield a powerful mechanism for binding multiple partners via unique interfaces (10,11). Chakrabarti and Janin (12) also stated that small binding sites consist of a single continuous patch, whereas larger interfaces may have several patches. Modules in interfaces have previously been defined with various methods, such as (i) the edge betweenness criterion in the residue–residue interaction network across the interface (7,13), (ii) differences in the energy profiles of interface residues (6,14,15) and (iii) clustering of structurally conserved residues in interfaces (9,11,16).
In the edge betweenness approach, the authors used the topology of the network without considering residue energy profiles. The other two approaches used hot spot residues, which are derived from energy profiles or structural conservation of residues. The residues that contribute most to the binding free energy are called ‘hot spots’ (17–19). Hot spots are tightly packed and structurally conserved residues (9,11,16). Keskin et al. (9) also showed that these hot spot residues are not randomly distributed along protein–protein interfaces but are clustered. In addition, there is a correlation between the energy change and the decrease in the accessible surface area of these hot spots (20). The cooperativity of the hot spot residues also sheds light on the complex binding organization of protein–protein interfaces (21–23). Computational methods (24–30) are widely used to extract hot spot information from interfaces, because experimental data are available for a very limited number of complexes. In this work, we combine the residue network topology with the residue energy profile based clustering approaches. The resulting residue clusters in interfaces are called ‘hot regions’ (9,22). As we showed in our previous study (22), hot regions are useful for interpreting protein interface properties. Here, we present our database ‘HotRegion’ in order to illustrate hot spot cooperativity information at protein–protein interfaces. HotRegion stores all available protein–protein interfaces extracted from Protein Data Bank (PDB) (31) entries and uses a dynamic update system driven by users’ search queries. If a user searches for hot regions via a PDB ID that is not in the HotRegion database, the database can rapidly update itself and show the results. We hope the database will help in detecting the cooperativity of functionally important residues, identifying mutagenesis targets and understanding the stability and specificity of protein–protein interfaces.

HOTREGION METHOD

An interface is the contact region between two interacting proteins. Two residues from different chains are defined as interface contacts if the distance between any atom of one residue and any atom of the other is less than the sum of the corresponding van der Waals radii plus 0.5 Å (32,33). Hotspot residues in interfaces are predicted with HotPoint (28) using the accessible surface area (ASA) and knowledge-based pair energies of each residue (34). In order to define hot regions, a network of hotspots is constructed. In the network, the nodes are the hotspot residues, and an edge links two nodes if the two hotspot residues are in contact. For this network, two hotspot residues are defined as contacting if the distance between their Cα atoms is smaller than 6.5 Å (9). Afterwards, the connected components of the network are found. If a connected component contains three or more nodes, it is labeled as a hot region, and the hotspot residues in it are labeled as the members of this hot region.
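
The hot region definition above can be illustrated with a minimal Python sketch. It is not the HotRegion implementation: the function name `find_hot_regions`, the residue labels and the toy coordinates are assumptions made for illustration, and the hot spots are taken as already predicted. Only the 6.5 Å Cα cutoff and the minimum size of three come from the text.

```python
from itertools import combinations
from math import dist


def find_hot_regions(hotspots, ca_cutoff=6.5, min_size=3):
    """Group predicted hot spots into hot regions.

    hotspots: dict mapping a residue label to its C-alpha (x, y, z) in angstroms.
    Returns one sorted list of labels per hot region.
    """
    # Edge between two hot spots if their C-alpha atoms are closer than the cutoff.
    neighbors = {label: set() for label in hotspots}
    for a, b in combinations(hotspots, 2):
        if dist(hotspots[a], hotspots[b]) < ca_cutoff:
            neighbors[a].add(b)
            neighbors[b].add(a)

    regions, seen = [], set()
    for start in hotspots:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        component, stack = [], [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        # Only components with at least min_size hot spots count as hot regions.
        if len(component) >= min_size:
            regions.append(sorted(component))
    return regions


# Made-up coordinates, not taken from any PDB entry.
toy = {
    "A33": (0.0, 0.0, 0.0),
    "A34": (3.8, 0.0, 0.0),
    "A37": (7.0, 0.0, 0.0),
    "A50": (20.0, 0.0, 0.0),
    "A53": (24.0, 0.0, 0.0),
}
print(find_hot_regions(toy))  # [['A33', 'A34', 'A37']]
```

In this toy input, A33 and A37 are 7.0 Å apart and therefore not directly linked, but both are linked to A34, so the three residues form one connected component. A50 and A53 are linked to each other but form a component of only two nodes, which is below the minimum size.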

DATABASE PROPERTIES

The HotRegion database is available at http://prism.ccbb.ku.edu.tr/hotregion. HotRegion comprises three major components: a relational database management system for data storage and management, a web application that serves as the interface to the database and a dynamic database update system. Data are stored in a relational MySQL database. The web application is implemented in PHP and JavaScript and runs on an Apache web server hosted on a Linux-based system.

DATABASE CONTENT

Currently, HotRegion contains all the PDB entries as of January 2011 (70 695 PDB entries, 147 892 protein–protein interfaces). If a user searches for hot region information of a protein complex (via PDB ID) that is not in HotRegion, the database can rapidly update itself and show the results. HotRegion lets researchers find the hot regions of protein complexes and provides structural properties of these complexes, such as pair potentials of interface residues and the ASA and relative ASA values of interface residues in both the monomer and complex forms of the proteins. The results also include a visualization of the interface using Jmol (35) and residue networks of the interactions among hot spot residues. An advanced search option is also available, in which users can change the default values of the HotRegion parameters. Advanced searches are stored in the database, and users can retrieve their jobs by using an email address and job id from the ‘Retrieve Job’ section. HotRegion needs atomic coordinates of the protein complexes in the standard PDB format. If atoms are present in alternative locations, only the first location is considered. For NMR structures, the first model is used. HotRegion is specific to protein–protein interfaces; chains corresponding to DNA and RNA structures return no interface solutions. If users do not supply enough information, the database asks for the missing information. The HotRegion database is free and open to all users, and there are no login requirements.
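
The two input rules stated above (first alternative location only, first NMR model only) could be applied while reading a PDB file roughly as follows. This is a sketch under stated assumptions, not the code behind HotRegion: the function name `read_atoms` is hypothetical, only `ATOM` records are read (skipping `HETATM` is an assumption), and "first location" is interpreted as the first record encountered for a given atom. The column positions follow the standard fixed-width PDB format.

```python
def read_atoms(pdb_lines):
    """Return ATOM records of the first model, one location per atom."""
    atoms = {}
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "ENDMDL":
            break  # stop after the first model (relevant for NMR ensembles)
        if record != "ATOM":
            continue  # assumption: HETATM and all other records are ignored
        # Chain, residue number plus insertion code, and atom name identify an atom.
        key = (line[21], line[22:27], line[12:16])
        if key in atoms:
            continue  # a later alternative location of an atom already stored
        atoms[key] = {
            "chain": line[21],
            "res_name": line[17:20].strip(),
            "res_seq": int(line[22:26]),
            "atom": line[12:16].strip(),
            "xyz": (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            ),
        }
    return list(atoms.values())
```

A typical call would be `read_atoms(open(path))` with `path` pointing to a local PDB file. The returned dictionaries keep file order, so coordinates for a given chain can be filtered with a simple list comprehension.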

TUTORIAL

Simple search

Users retrieve the data of a protein interface simply by entering a PDB ID and two chain identifiers. There must be an interface between the given monomers in order to obtain hot region information. Users also have control over the presentation of the results. Three properties of the interface (residue number, residue type, chain id) are always displayed in the result table and the output file; the rest are displayed according to the user's preferences (Figure 1).

Figure 1.

Properties of HotRegion Database in a quick view.

Advanced search

Users can retrieve the data based on their own interface and hot region criteria. Users must enter an email address in order to retrieve their jobs afterwards. They can supply a PDB file or enter a PDB code. After entering the chain identifiers of the two monomers that share an interface, users can set the interface extraction threshold, which is added to the sum of the van der Waals radii of the atoms. As this threshold increases, the number of interface residues increases. Users can also change the hot spot neighbor criterion, which is the Cα distance cutoff between hot spots. As this cutoff increases, hot regions start to merge into larger hot regions and the number of hot regions decreases.
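
The effect of the interface extraction threshold can be seen in a small Python sketch of the contact rule (distance below the sum of the two van der Waals radii plus the threshold). Everything here except the rule itself and the 0.5 Å default is an assumption for illustration: the function name `interface_residues`, the toy atoms and the radii table, which holds Bondi values for four elements and is not necessarily the set of radii used by HotRegion.

```python
from math import dist

# Bondi van der Waals radii in angstroms (illustrative choice, see lead-in).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def interface_residues(chain_a, chain_b, threshold=0.5):
    """Return the residues of two chains that are in contact.

    chain_a, chain_b: lists of (residue_id, element, (x, y, z)) atom tuples.
    Two residues are in contact if any atom pair is closer than the sum of
    the two van der Waals radii plus the threshold.
    """
    contacts = set()
    for res_a, elem_a, xyz_a in chain_a:
        for res_b, elem_b, xyz_b in chain_b:
            cutoff = VDW_RADII[elem_a] + VDW_RADII[elem_b] + threshold
            if dist(xyz_a, xyz_b) < cutoff:
                contacts.add(res_a)
                contacts.add(res_b)
    return contacts


# Made-up atoms, one per residue, placed along the x axis.
chain_a = [("A33", "C", (0.0, 0.0, 0.0)), ("A50", "O", (10.0, 0.0, 0.0))]
chain_b = [("B75", "C", (3.8, 0.0, 0.0)), ("B86", "N", (14.0, 0.0, 0.0))]

print(sorted(interface_residues(chain_a, chain_b, threshold=0.5)))
# ['A33', 'B75']
print(sorted(interface_residues(chain_a, chain_b, threshold=1.0)))
# ['A33', 'A50', 'B75', 'B86']
```

With the default threshold, only the A33/B75 pair (3.8 Å apart, cutoff 3.9 Å) qualifies. Raising the threshold to 1.0 Å lifts the O/N cutoff from 3.57 Å to 4.07 Å, so the A50/B86 pair at 4.0 Å is added to the interface.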

Retrieve Job

Returning users can retrieve the results of previous jobs by using their job ids and email addresses.

CASE STUDY

Contribution to binding affinity of the proteins

Colicins are plasmid-encoded, stress-induced protein antibiotics that specifically target Escherichia coli cells. When an immunity protein binds to its specific (cognate) nuclease partner, it can protect the organism from endogenous and incoming colicin (36). Kleanthous and coworkers (37) showed that a limited number of mutations at the interface provide high-affinity binding to a non-cognate partner. According to this work, the non-cognate complex between the colicin E9 endonuclease (E9 DNase) and immunity protein 2 (Im2) (PDB Id: 2WPT) has a weaker binding affinity than the cognate femtomolar E9 DNase–Im9 (PDB Id: 1EMV) interaction. When three Im2 residues are substituted with their Im9 counterparts (Im2 D33L/N34V/R38T), the binding energy becomes nearly equal to that of the cognate complex. HotRegion results for these complexes show that the predicted hot spots overlap with the experimental findings. The cognate complex has two hot regions, whereas the non-cognate complex has one (Figure 2, Table 1). The structural differences at the interface arise from different side chain orientations. Possibly, the cognate complex uses the two hot regions at the interface to increase the binding affinity of the interaction. When we compare the hot regions of the two complexes, the only hot region residues that are present in the cognate complex but absent from the non-cognate complex are L33 and V34 (they form the extra hot region with T37 in the cognate complex). When the corresponding residues in the non-cognate complex are mutated to L and V, they may form the extra hot region with T37 in the non-cognate complex as well, raising its binding affinity towards that of the cognate complex.

Figure 2.

(A) Colicin E9 endonuclease (green) interacts with Im9 (purple) and the complex has two hot regions (red and orange). (B) Colicin E9 endonuclease (green) interacts with Im2 (blue) and the complex has one hot region (red).

Table 1.

Hot region results from HotRegion Database for interfaces 1EMVAB and 2WPTAB

Interface	Residue number	Residue type	Chain	Hot region identifier
1EMVAB	33	LEU	A	1
1EMVAB	34	VAL	A	1
1EMVAB	37	VAL	A	1
1EMVAB	50	SER	A	0
1EMVAB	53	ILE	A	0
1EMVAB	54	TYR	A	0
2WPTAB	37	VAL	A	0
2WPTAB	50	SER	A	0
2WPTAB	53	ILE	A	0
2WPTAB	54	TYR	A	0

CONCLUSION

A protein–protein interface consists of the two binding sites of two proteins interacting with each other. Binding energies differ among protein complexes, and the hot spot residues are distributed in a distinctive pattern in each. Extracting hot region information from the non-uniformly distributed binding energy of interfaces is important for analyzing the binding sites of proteins. Some complexes are built upon more than one hot region, and the size of a hot region varies with the binding site properties. We have earlier shown that such hot regions (hotspot clusters) are a signature of protein–protein interfaces, especially for hub proteins (22). A hub protein binds different partner proteins by using different hot regions. This networked hotspot organization may imply that the contribution of the hotspots to the stability of the protein–protein complex within a hot region is cooperative. We hope the database will help in detecting the cooperativity of functionally important residues, identifying mutagenesis targets and understanding the stability and specificity of protein–protein interfaces.

FUNDING

This project has been supported by TUBITAK (Research Grant No 109T343 and 109E207) and The Turkish Academy of Sciences (TUBA). Funding for open access charge: The open access publication charge for this paper has been waived by Oxford University Press - NAR Editorial Board members are entitled to one free paper per year in recognition of their work on behalf of the journal. Conflict of interest statement. None declared.

37 in total

1. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. ConSurf: an algorithmic tool for the identification of functional regions in proteins by surface mapping of phylogenetic information.

Authors: A Armon; D Graur; N Ben-Tal
Journal: J Mol Biol Date: 2001-03-16 Impact factor: 5.469

3. Dissecting protein-protein recognition sites.

Authors: Pinak Chakrabarti; Joël Janin
Journal: Proteins Date: 2002-05-15

4. A new, structurally nonredundant, diverse data set of protein-protein interfaces and its implications.

Authors: Ozlem Keskin; Chung-Jung Tsai; Haim Wolfson; Ruth Nussinov
Journal: Protein Sci Date: 2004-04 Impact factor: 6.725

5. The structural and energetic basis for high selectivity in a high-affinity protein-protein interaction.

Authors: Nicola A G Meenan; Amit Sharma; Sarel J Fleishman; Colin J Macdonald; Bertrand Morel; Ruth Boetzel; Geoffrey R Moore; David Baker; Colin Kleanthous
Journal: Proc Natl Acad Sci U S A Date: 2010-05-17 Impact factor: 11.205

6. Long-range cooperative binding effects in a T cell receptor variable domain.

Authors: Beenu Moza; Rebecca A Buonpane; Penny Zhu; Christine A Herfst; A K M Nur-ur Rahman; John K McCormick; David M Kranz; Eric J Sundberg
Journal: Proc Natl Acad Sci U S A Date: 2006-06-20 Impact factor: 11.205

7. Relating three-dimensional structures to protein networks provides evolutionary insights.

Authors: Philip M Kim; Long J Lu; Yu Xia; Mark B Gerstein
Journal: Science Date: 2006-12-22 Impact factor: 47.728

8. Similar binding sites and different partners: implications to shared proteins in cellular pathways.

Authors: Ozlem Keskin; Ruth Nussinov
Journal: Structure Date: 2007-03 Impact factor: 5.006

9. Empirical estimation of the energetic contribution of individual interface residues in structures of protein-protein complexes.

Authors: Mainak Guharoy; Pinak Chakrabarti
Journal: J Comput Aided Mol Des Date: 2009-05-29 Impact factor: 3.686

10. Design of multi-specificity in protein interfaces.

Authors: Elisabeth L Humphris; Tanja Kortemme
Journal: PLoS Comput Biol Date: 2007-07-05 Impact factor: 4.475
