
The relationship between relative solvent accessible surface area (rASA) and irregular structures in protean segments (ProSs).

Divya Shaji

Abstract

Intrinsically disordered proteins (IDPs) lack a stable three-dimensional structure under physiological conditions, yet they exhibit numerous biological activities. Protean segments (ProSs) are the functional regions of IDPs that undergo disorder-to-order transitions upon binding to their partners. Example ProSs were collected from the Intrinsically Disordered proteins with Extensive Annotations and Literature (IDEAL) database. The ProS interfaces were classified into core, rim, and support, and their secondary structure elements (SSEs) were analyzed based on the relative solvent accessible surface area (rASA). The amino acid compositions and the rASAs of ProS SSEs at the interface, core and rim were compared to those of heterodimers. The average number of contacts of alpha helices and irregular residues was calculated for each ProS and heterodimer. Furthermore, the ProSs were classified as high or low efficient based on their average number of contacts at the interface. The results indicate that the irregular structures of ProSs and heterodimers are significantly different. The rASA of irregular structures in the monomeric state (rASAm) is large, which leads to a larger ΔrASA and many contacts in ProSs.

Keywords:  Intrinsically disordered proteins; protean segments (ProSs); protein interface; protein-protein interactions; relative solvent accessible surface area (rASA); secondary structure elements (SSEs)

Year:  2016        PMID: 28250616      PMCID: PMC5314839          DOI: 10.6026/97320630012381

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

An intrinsically disordered protein (IDP) is a protein that is disordered (as a whole or in part) in the unbound state and undergoes a disorder-to-order transition upon binding to its partners [1,2,3]. IDPs have numerous biological activities, such as signal transduction and transcriptional regulation, and are highly abundant in nature [2,4]. These proteins are associated with various human diseases, including cancer, cardiovascular disease, neurodegenerative diseases and amyloidoses [5,6,7]. Because of their role in biological processes and their involvement in disease, IDPs are the focus of many biomedical research areas and represent attractive novel drug targets [7,8]. Protean segments (ProSs) are the functional regions of IDPs that undergo disorder-to-order transitions upon binding to their partners (i.e., coupled folding and binding) [9,10,11,12]. The ProS interface is composed of a small core and a large rim. The average number of contacts of the ProS interface with its interaction partners is greater than that of heterodimers. This indicates that ProSs interact effectively in the rim region as well as in the core. The key to the effective interactions of ProSs is the solvent exposure of rim residues in the monomeric state (rASAm) [13]. The goal of this work is to investigate the properties of secondary structure elements (SSEs) at the interface of ProSs relative to those of heterodimers. The interfaces of ProSs and heterodimers were classified into the core, rim, and support based on their relative solvent accessible surface area (rASA) [14]. The average number of contacts of alpha helices and irregular residues was calculated for each ProS and heterodimer. Furthermore, the ProSs were classified into high and low efficient ProSs based on their average number of contacts at the interface. Compared to heterodimers, irregular residues of ProSs have a larger number of contacts than their alpha helices. Moreover, irregular residues of ProSs have a larger ΔrASA than their alpha helices. The rASA of irregular structures in the monomeric state is large, which leads to a larger ΔrASA and many contacts in ProSs. In addition, high efficient ProSs have a larger average rASA in the monomeric state (rASAm) and a larger average ΔrASA than low efficient ProSs.

Materials and Methods

ProSs and heterodimers

All ProSs (210) in 70 protein sequences were collected from the IDEAL database (as of August 2013) [11,12]. If more than one ProS was found in a protein and their positions overlapped, the longest ProS was selected. Sequence redundancy was removed at 80% sequence similarity (based on the CLUSTALW alignment) [15]. Hierarchical clustering was done with R [16] using complete-linkage clustering, and the longest ProS in each cluster was selected as the representative. The resulting non-redundant set contained 99 ProSs [13]. DNA-binding ProSs and one-to-many binding ProSs (a single ProS that binds to two or more different partners) were discarded [17]. Both X-ray and NMR structures were used in this study. A non-redundant dataset of 276 heterodimers was selected from the Protein Data Bank (PDB) [18], using the PDB's advanced search interface (as of July 2014). The search criteria were as follows: (1) less than 30% sequence identity; (2) the macromolecule type contained only proteins; (3) the oligomeric state was heterodimer; (4) each chain was longer than 100 residues; and (5) structures determined by X-ray crystallography had a resolution better than 3 Å. Only the smaller protomer of each heterodimer was analyzed as the reference for ProSs.

Secondary structure analysis

The program DSSP [19] was used to assign secondary structures. The eight types calculated by DSSP were reduced to three: alpha helices (H, G and I), beta strands (E) and irregulars (B, S, T and C). The amino acid propensity, average number of contacts and relative solvent accessible surface areas (rASAs) of alpha helices and irregulars were analyzed in detail.
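The reduction from eight DSSP states to three classes can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, and treating a blank or '-' code as coil is an assumption about how the DSSP output was read.

```python
from collections import Counter

# Three-class grouping as described in the text.
HELIX = set("HGI")       # alpha helices
STRAND = set("E")        # beta strands
IRREGULAR = set("BSTC")  # irregular structures


def reduce_dssp(state: str) -> str:
    """Return 'helix', 'strand' or 'irregular' for one DSSP state code."""
    code = state.strip().upper()
    if code in HELIX:
        return "helix"
    if code in STRAND:
        return "strand"
    # Assumption: a blank or '-' is treated as coil, i.e. irregular.
    if code in IRREGULAR or code in ("", "-"):
        return "irregular"
    raise ValueError(f"unknown DSSP state: {state!r}")


def composition(dssp_states: str) -> dict:
    """Fraction of residues in each of the three classes."""
    if not dssp_states:
        raise ValueError("empty DSSP string")
    counts = Counter(reduce_dssp(s) for s in dssp_states)
    total = sum(counts.values())
    return {k: counts[k] / total for k in ("helix", "strand", "irregular")}
```

Calling `composition` on the per-residue state string of an interface gives the class fractions of the kind reported in Figure 1A and B.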

Calculation of amino acid propensities

The propensities of amino acids are represented as the Chou–Fasman parameters [20], CF(a, P) = (N_a(P) / N(P)) / (N_a^{all} / N^{all}), where N_a(P) is the number of residues of amino acid a in place P, N(P) is the total number of residues in P, N_a^{all} is the total number of residues of amino acid a in the protein sequence, and N^{all} is the total number of residues in the protein sequence. For P, the alpha helix and irregular residues of ProSs and heterodimers were considered. To calculate the reference states (the denominator), the same secondary structure types of PDBSelect25 [21] proteins were used. PDBSelect25 contains a representative set of PDB entries with less than 25% sequence identity.
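As a worked illustration of the Chou–Fasman ratio, the sketch below computes CF(a, P) from two residue strings: the residues in place P and the reference residues. The function name and the string-based input are assumptions made for this example. Amino acids that are absent from the reference are skipped, because the ratio is undefined for them.

```python
from collections import Counter


def chou_fasman(place_residues: str, reference_residues: str) -> dict:
    """CF(a, P) = (N_a(P) / N(P)) / (N_a^all / N^all) for every amino acid a.

    place_residues: one-letter codes of the residues in place P
    reference_residues: one-letter codes of the reference set
    """
    if not place_residues or not reference_residues:
        raise ValueError("both residue sets must be non-empty")
    place_counts = Counter(place_residues)
    ref_counts = Counter(reference_residues)
    n_place = len(place_residues)
    n_ref = len(reference_residues)
    propensities = {}
    for aa, n_ref_aa in ref_counts.items():
        freq_in_place = place_counts[aa] / n_place  # N_a(P) / N(P)
        freq_in_ref = n_ref_aa / n_ref              # N_a^all / N^all
        propensities[aa] = freq_in_place / freq_in_ref
    return propensities
```

A value above 1 means the amino acid is enriched in P relative to the reference, and a value below 1 means it is depleted.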

Analysis of relative ASA (rASA) and residue contacts

The interfaces of each ProS and heterodimer were classified into the core, rim and support based on the definitions of Levy [14]. The relative solvent accessible surface area (rASA) of a residue indicates its degree of solvent exposure. It is calculated by normalizing the accessible surface area (ASA) of the residue in a protein structure by the ASA of the same residue in its most solvent-exposed state [22]. The program Naccess [23], which is an implementation of Lee and Richards' algorithm [24], was used to calculate the rASA of each residue in the monomeric (rASAm) and complex (rASAc) states for ProSs and heterodimers. The change in relative solvent accessible surface area (ΔrASA) of each residue was calculated as the difference between the rASAs of the monomeric (rASAm) and complex (rASAc) states. The rASAs were averaged over the interface, core and rim residues to derive the average rASAs of proteins. Two residues, i and j, were considered to be in contact if any atom of residue i was within a distance of < 4.5 Å of any atom of residue j [25,26]. The average number of external contacts and the rASAs at the interface, core and rim in alpha helices and irregular residues were calculated for each ProS and heterodimer. External contacts are defined as the contacts between the proteins and their interaction partners. The support and beta strand residues were discarded from this study because of their scarcity in ProSs.
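The per-residue ΔrASA and the core, rim and support assignment can be sketched as below. The text does not state the numeric thresholds; the 25% cutoff used here is an assumption based on Levy's published definition [14], and the handling of values exactly at the cutoff is a choice made for this example. Function names are hypothetical, and rASA values are taken in percent, as Naccess reports them.

```python
from collections import defaultdict


def classify_residue(rasa_m: float, rasa_c: float, cutoff: float = 25.0) -> str:
    """Assign a residue to a structural region from its rASA values (percent).

    rasa_m: rASA in the monomeric state; rasa_c: rASA in the complex.
    The 25% cutoff is an assumption, not a value given in the text.
    """
    delta = rasa_m - rasa_c      # ΔrASA = rASAm - rASAc
    if delta <= 0:
        return "non-interface"   # no burial upon binding
    if rasa_m < cutoff:
        return "support"         # already buried in the monomer
    if rasa_c < cutoff:
        return "core"            # exposed in the monomer, buried in the complex
    return "rim"                 # still exposed in the complex


def average_delta_rasa(residues) -> dict:
    """Mean ΔrASA per interface region for (rasa_m, rasa_c) pairs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rasa_m, rasa_c in residues:
        region = classify_residue(rasa_m, rasa_c)
        if region == "non-interface":
            continue
        sums[region] += rasa_m - rasa_c
        counts[region] += 1
    return {region: sums[region] / counts[region] for region in sums}
```

In practice the comparison `delta <= 0` may need a small tolerance, since rASA values computed for the two states can differ by rounding noise.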

High and low efficient ProSs

Based on the average number of contacts in the interface, the ProSs were classified into high and low efficient ProSs. High and low efficient ProSs were defined as ProSs with an average number of contacts greater than 4 and less than 2.5, respectively. Short ProSs (fewer than 11 residues) were excluded from this classification. Several properties were analyzed for the high and low efficient ProSs (see Results and Discussion). The datasets contain 11 high efficient and 14 low efficient ProSs [13]. The radius of gyration was calculated using the Bio3D package [27] in R [16].
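The contact definition and the efficiency classes can be sketched together. The example below assumes that each residue is given as a list of (x, y, z) atom coordinates in Å and that the average is taken over the interface residues passed in; both are assumptions for illustration, and the function names are hypothetical. ProSs with an average between 2.5 and 4 fall into neither class.

```python
from math import dist


def in_contact(res_i, res_j, cutoff: float = 4.5) -> bool:
    """True if any atom of res_i is within `cutoff` Å of any atom of res_j."""
    return any(dist(a, b) < cutoff for a in res_i for b in res_j)


def external_contacts(interface_residues, partner_residues, cutoff: float = 4.5):
    """Number of partner residues in contact with each interface residue."""
    return [
        sum(in_contact(res, partner, cutoff) for partner in partner_residues)
        for res in interface_residues
    ]


def average_contacts(contact_counts) -> float:
    """Average number of external contacts per interface residue."""
    if not contact_counts:
        return 0.0
    return sum(contact_counts) / len(contact_counts)


def efficiency_class(avg_contacts: float, pros_length: int, min_length: int = 11):
    """'high' (> 4), 'low' (< 2.5), or None if short or intermediate."""
    if pros_length < min_length:
        return None  # short ProSs are excluded from the classification
    if avg_contacts > 4:
        return "high"
    if avg_contacts < 2.5:
        return "low"
    return None
```

The all-pairs distance loop is quadratic in the number of atoms, which is acceptable for single complexes; a spatial index would be the usual choice for larger data sets.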

Statistical analysis

The Wilcoxon rank-sum test was performed in RStudio [28] to calculate the P-values. P < 0.01 was considered statistically significant.
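For readers without R at hand, a two-sided Wilcoxon rank-sum test can be sketched in Python using the normal approximation with tie and continuity corrections. This is an illustrative stand-in, not the procedure run in the study: R may use an exact distribution for small samples without ties, so P-values for small data sets can differ from those given here. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic of sample x, approximate P-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))) if n > 1 else 0.0
    if var_u == 0:
        return u, 1.0  # all values tied: no evidence of a difference
    # Continuity correction of 0.5 towards the mean.
    z = max(abs(u - mean_u) - 0.5, 0.0) / sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(z))
    return u, min(p_value, 1.0)
```

With the threshold used in the text, a difference would be called significant when the returned P-value is below 0.01.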

Results and Discussion

Secondary structure analysis of ProSs and heterodimers

The secondary structure assignments for each ProS and heterodimer interface were determined by the DSSP program [19]. This analysis (see Figure 1A,B) showed that 33% of the residues in the ProS dataset were alpha helices, 6% were beta strands, and 61% were residues of irregular structure. The secondary structure distribution of the ProS interface is very different from that of heterodimers. The largest differences between ProSs and heterodimers are in the content of irregular structures and beta strands. Alpha helices are almost equally abundant in both data sets. The ProS interface contains 15% more irregular residues, 13% fewer beta strands and 2% fewer alpha helices than heterodimers. The differences between the distributions were evaluated, and the boxplots of the rates of alpha helices and irregulars are shown in Figure 1C and D. The alpha helix residues of ProSs and heterodimers are not significantly different (P-value = 0.03). Notably, the irregular structures of ProSs and heterodimers are significantly different (P-value = 1.05e-07).
Figure 1

Distribution of secondary structure elements (SSEs) in ProS and heterodimer interfaces. The composition of SSEs in the ProS interface (A) and the heterodimer interface (B). The program DSSP [19] was used to assign secondary structures. The eight types calculated by DSSP were reduced to three: alpha helices, beta strands, and irregulars. The distributions of alpha helices, beta strands and irregulars are colored in green, violet and yellow, respectively. Because of the scarcity of beta strand residues in ProSs, only alpha helices and irregulars were considered for further analysis. Box-plots of the rates of (C) alpha helix residues and (D) irregular residues in ProS (red) and heterodimer (blue) interfaces. The distribution of the irregulars is significantly different as assessed by the Wilcoxon rank-sum test (P-values: alpha helices = 0.03, irregulars = 1.05e-07).

Interactions of secondary structure elements (SSEs)

The amino acid propensities of the secondary structure elements (SSEs) (alpha helices and irregular structures) of ProSs and heterodimers were examined. The Chou–Fasman parameters [20] for alpha helix and irregular residues at the interface were calculated. Figure 2A and B show the correlations between ProS and heterodimer alpha helices and between ProS and heterodimer irregulars at the interface. In both cases, positive correlations were observed, with values of 0.50 and 0.61 for alpha helix and irregular residues, respectively. This indicates that the amino acid composition of the ProS SSEs is moderately similar to that of heterodimers.
Figure 2

Scatter plots of the Chou–Fasman parameters [20] of alpha helices and irregulars at the interface. (A) ProS alpha helices vs. heterodimer alpha helices. (B) ProS irregulars vs. heterodimer irregulars.

The core residues at the interface are the hydrophobic residues, generally in the central region of the interface, and play an important role in the interaction. The rim residues are the polar residues, located on the outer edges of the interface. The support residues represent the intersection between the interior and the interface [14]. Previous studies have indicated that the ProS interface can be in contact with a larger number of residues of the interaction partners than the heterodimer interface [13,29]. To examine the efficiency of interactions in different secondary structure elements (SSEs), the average number of external contacts of the interface, core, and rim residues was calculated for each ProS and heterodimer (see Figure 3 A-F). Compared to heterodimers, irregular residues of ProSs have a larger number of contacts than their alpha helices. The P-values for alpha helices and irregulars at the interface, core, and rim are shown in Table 1 and Table 2, respectively.
Figure 3

Interactions of the secondary structure elements (SSEs) in ProSs and heterodimers. Box-plots of the average number of contacts of the alpha helices and irregulars in ProSs and heterodimers at the interface (A and B), core (C and D) and rim (E and F). The distributions of ProSs and heterodimers are colored in red and blue, respectively (these colors are used throughout this paper). The differences between the distributions were evaluated, and the P-values are shown in Tables 1 and 2.

Table 1

P-values of alpha helices in ProSs and heterodimers (using the Wilcoxon rank-sum test)

Features                      Places      P-values
Average number of contacts    Interface   1.19E-18
Average number of contacts    Core        3.47E-16
Average number of contacts    Rim         0.0002
Average rASAm                 Interface   3.26E-19
Average rASAm                 Core        1.22E-13
Average rASAm                 Rim         4.04E-05
Average rASAc                 Interface   2.95E-05
Average rASAc                 Core        0.008
Average rASAc                 Rim         0.044
Average ΔrASA                 Interface   1.15E-18
Average ΔrASA                 Core        2.47E-14
Average ΔrASA                 Rim         0.015

Table 2

P-values of irregular structures in ProSs and heterodimers (using the Wilcoxon rank-sum test)

Features                      Places      P-values
Average number of contacts    Interface   2.36E-13
Average number of contacts    Core        1.89E-09
Average number of contacts    Rim         1.19E-16
Average rASAm                 Interface   2.15E-44
Average rASAm                 Core        4.15E-17
Average rASAm                 Rim         1.80E-29
Average rASAc                 Interface   3.52E-33
Average rASAc                 Core        0.0009
Average rASAc                 Rim         4.16E-18
Average ΔrASA                 Interface   1.24E-14
Average ΔrASA                 Core        9.79E-13
Average ΔrASA                 Rim         5.21E-12

Relative ASA (rASA) of secondary structure elements (SSEs)

Our previous study showed that the average ΔrASA correlates well with the average number of contacts in ProSs [13]. The ΔrASA of each residue is defined as the difference between the rASA of the unbound state (rASAm) and that of the bound state (rASAc), that is, ΔrASA = rASAm - rASAc, and both rASAs are used to define the core, rim and support residues [14]. Here, the relative solvent accessible surface areas (rASAs) of the alpha helices and irregular structures in each ProS and heterodimer at the interface, core and rim were analyzed in detail. Figure 4 A-C, D-F, and G-I show the distributions of the average rASAm, rASAc and ΔrASA of ProS alpha helices, respectively, for the interface, core, and rim, compared with those of heterodimers. Similarly, Figure 5 A-C, D-F and G-I show the distributions of the average rASAm, rASAc and ΔrASA of ProS irregulars, respectively, for the interface, core, and rim, compared with those of heterodimers. In both the core and rim, irregular residues of ProSs have a larger rASA in the monomeric state than those of heterodimers. The differences are confirmed by a statistical test (see Table 1 and Table 2). The rASA of ProS irregular residues in the monomeric state (rASAm) is large, resulting in a larger ΔrASA, which leads to the formation of many contacts. Contour plots of the average rASAm and rASAc of alpha helices and irregular structures are shown in Figure 7 and Figure 8.
Figure 4

Average rASAs of alpha helices in ProSs and heterodimers. Average rASAm at the interface (A), core (B) and rim (C). Average rASAc at the interface (D), core (E) and rim (F). Average ΔrASA at the interface (G), core (H) and rim (I). The differences between the distributions were evaluated, and the P-values are shown in Table 1.

Figure 5

Average rASAs of irregulars in ProSs and heterodimers. Average rASAm at the interface (A), core (B) and rim (C). Average rASAc at the interface (D), core (E) and rim (F). Average ΔrASA at the interface (G), core (H) and rim (I). The differences between the distributions were evaluated, and the P-values are shown in Table 2.

Figure 7

Contour plots of the average rASAm and average rASAc in alpha helices. (A) Average rASAm vs. average rASAc of the ProS core. (B) Average rASAm vs. average rASAc of the ProS rim. (C) Average rASAm vs. average rASAc of the heterodimer core. (D) Average rASAm vs. average rASAc of the heterodimer rim. The rASAs of each residue in the monomeric and in the complexed states in ProSs and heterodimers were calculated using Naccess [23]. The highest density regions are shown in red, and the lowest density regions are in green.

Figure 8

Contour plots of the average rASAm and average rASAc in irregulars. (A) Average rASAm vs. average rASAc of the ProS core. (B) Average rASAm vs. average rASAc of the ProS rim. (C) Average rASAm vs. average rASAc of the heterodimer core. (D) Average rASAm vs. average rASAc of the heterodimer rim. The rASAs of each residue in the monomeric and in the complexed states in ProSs and heterodimers were calculated using Naccess [23]. The highest density regions are shown in red, and the lowest density regions are in green.

Based on the average number of contacts at the interface, the ProSs were classified into high and low efficient ProSs (see Methods). To examine the properties of high efficient ProSs, several factors were analyzed for each high and low efficient ProS: average rASAm, average rASAc, average ΔrASA, rate of the interface, rate of the core, rate of the rim, radius of gyration (Rg), and length of the ProS. Boxplots of the distributions of high and low efficient ProSs are shown in Figure 6 A-H. P-values of the high and low efficient ProSs are shown in Table 3.
Figure 6

Box plots of the high and low efficient ProSs. The ProSs are classified into high and low efficient ProSs based on the average number of contacts at the interface. The high efficient ProS is shown in pink, and the low efficient ProS is shown in orange. The P-values are shown in Table 3.

Table 3

P-values of high and low efficient ProSs (using the Wilcoxon rank-sum test)

Features                  P-values
Average rASAm             0.0007
Average rASAc             0.403
Average ΔrASA             8.97E-07
Rate of the interface     0.028
Rate of the core          0.546
Rate of the rim           0.366
Length of the ProSs       0.02
Normalized Rg             0.228

The radius of gyration is used to describe the compactness of a protein, as well as the folding process from the denatured state to the native state [30,31]. The results show that there is no significant difference between the normalized radii of gyration (Rg) of high and low efficient ProSs. Similarly, the average rASAc, rate of the interface, rate of the core, rate of the rim, and length of the ProSs do not differ significantly between high and low efficient ProSs at the P < 0.01 threshold. The reason for this may be the low number of protean segments (ProSs) in the high and low efficient datasets. Interestingly, only the average rASAm and average ΔrASA are statistically significant. This supports the hypothesis that the average rASA in the monomeric state (rASAm) plays a major role in the efficient interactions of ProSs [13].
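As a reminder of what the radius of gyration measures, the sketch below computes an unweighted Rg, the root-mean-square distance of the atoms from their centroid. It assumes equal atomic masses and a plain list of (x, y, z) coordinates. The study used the Bio3D package for this calculation, and the normalization applied to Rg is not described in the text, so it is not reproduced here.

```python
from math import sqrt


def radius_of_gyration(coords) -> float:
    """Unweighted radius of gyration of a set of (x, y, z) coordinates."""
    n = len(coords)
    if n == 0:
        raise ValueError("no coordinates given")
    # Centroid of the atoms (all masses assumed equal).
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    # Mean squared distance from the centroid.
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return sqrt(mean_sq)
```

A smaller Rg for the same number of residues indicates a more compact conformation.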

Conclusion

The properties of secondary structure elements (SSEs) at the interface, core, and rim of ProSs were analyzed relative to those of heterodimers. The results demonstrate that irregular structures of ProSs and heterodimers are significantly different. Irregular structures have a larger rASA in the monomeric state (rASAm) that leads to the formation of many contacts in ProSs.
  26 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

Review 2.  Coupling of folding and binding for unstructured proteins.

Authors:  H Jane Dyson; Peter E Wright
Journal:  Curr Opin Struct Biol       Date:  2002-02       Impact factor: 6.809

Review 3.  Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm.

Authors:  P E Wright; H J Dyson
Journal:  J Mol Biol       Date:  1999-10-22       Impact factor: 5.469

4.  Side-chain clusters in protein structures and their role in protein folding.

Authors:  J Heringa; P Argos
Journal:  J Mol Biol       Date:  1991-07-05       Impact factor: 5.469

5.  A simple definition of structural regions in proteins and its use in analyzing interface evolution.

Authors:  Emmanuel D Levy
Journal:  J Mol Biol       Date:  2010-09-22       Impact factor: 5.469

6.  Amino acid interaction preferences in proteins.

Authors:  Anupam Nath Jha; Saraswathi Vishveshwara; Jayanth R Banavar
Journal:  Protein Sci       Date:  2010-03       Impact factor: 6.725

7.  Hydrophobicity of amino acid residues in globular proteins.

Authors:  G D Rose; A R Geselowitz; G J Lesser; R H Lee; M H Zehfus
Journal:  Science       Date:  1985-08-30       Impact factor: 47.728

Review 8.  Bioinformatical approaches to characterize intrinsically disordered/unstructured proteins.

Authors:  Zsuzsanna Dosztányi; Bálint Mészáros; István Simon
Journal:  Brief Bioinform       Date:  2009-12-10       Impact factor: 11.622

Review 9.  Intrinsically disordered proteins in human diseases: introducing the D2 concept.

Authors:  Vladimir N Uversky; Christopher J Oldfield; A Keith Dunker
Journal:  Annu Rev Biophys       Date:  2008       Impact factor: 12.981

10.  IDEAL in 2014 illustrates interaction networks composed of intrinsically disordered proteins and their binding partners.

Authors:  Satoshi Fukuchi; Takayuki Amemiya; Shigetaka Sakamoto; Yukiko Nobe; Kazuo Hosoda; Yumiko Kado; Seiko D Murakami; Ryotaro Koike; Hidekazu Hiroaki; Motonori Ota
Journal:  Nucleic Acids Res       Date:  2013-10-30       Impact factor: 16.971
