The role of disorder in interaction networks: a structural analysis.

Philip M Kim, Andrea Sboner, Yu Xia, Mark Gerstein.

Abstract

Recent studies have emphasized the value of incorporating structural information into the topological analysis of protein networks. Here, we used structural information to investigate the role of intrinsic disorder in these networks. Hub proteins tend to be more disordered than other proteins (i.e. the proteome average); however, we find this to be true only for those with one or two binding interfaces ('single'-interface hubs). In contrast, the distribution of disordered residues in multi-interface hubs is indistinguishable from that of the overall proteome. Surprisingly, we find that the binding interfaces in single-interface hubs are highly structured, as is the case for multi-interface hubs. However, the binding partners of single-interface hubs tend to have a higher level of disorder than the proteome average, suggesting that their binding promiscuity is related to the disorder of their binding partners. In turn, the higher level of disorder of single-interface hubs can be partly explained by their tendency to bind to each other in a cascade. A good illustration of this trend can be found in signaling pathways and, more specifically, in kinase cascades. Finally, our findings have implications for the current controversy related to party- and date-hubs.

Year: 2008 PMID: 18364713 PMCID: PMC2290937 DOI: 10.1038/msb.2008.16

Source DB: PubMed Journal: Mol Syst Biol ISSN: 1744-4292 Impact factor: 11.429

Introduction

There have been many advances in the study of protein interaction networks enabled by the advent of high-throughput technology (Barabasi and Oltvai, 2004). Recent studies have started to put these networks into the context of 3D protein structures (Aloy and Russell, 2006; Kim ). Many genomic properties that had been previously linked to topological properties were shown to be better described by structural quantities. In particular, the notion of network hubs was refined into two different kinds of hubs, 'single' (or singlish)-interface and multi-interface hubs (Kim ). The former have only a few interaction interfaces (two at most) and tend to be enriched in signaling proteins, whereas the latter correspond to central members of larger protein complexes. In contrast to the classical view of structured proteins, the concept of intrinsically disordered regions has recently emerged (Dunker ; Linding ; Iakoucheva ; Radivojac ). Disordered regions are segments of a protein that do not completely fold and remain flexible and unordered. Computational predictions of disordered regions have found that, although proteomes of archaea and bacteria comprise only a small fraction of intrinsically disordered proteins (about 2–4%), eukaryotic proteomes include a large fraction (about 33%) of long regions that are natively disordered and thus do not adopt a fixed structure (Ward ). The functions of disordered regions have been classified into four categories: molecular recognition, molecular assembly, protein modification, and entropic chain activity (Wright and Dyson, 1999; Sugase ). Disordered regions of proteins have been shown to have key physiological roles; for example, they act as communicators in many cellular signaling pathways. In particular, the target sites of both protein kinases and many modular protein domains (such as SH3, PDZ, SH2 etc.) generally lie in disordered regions (Iakoucheva ; Beltrao and Serrano, 2005; Fuxreiter ), presumably because disordered regions are more prone to present the short linear motifs that these domains and kinases bind to. The initial studies on structural networks did not examine the role of disorder (Kim ; Beltrao ; Devos and Russell, 2007). In this work, we make the first rigorous investigation of disorder in structural networks and its role in many cellular properties.

Results and discussion

Singlish-interface hubs have a higher propensity for disorder, whereas multi-interface hubs have the same propensity as normal proteins

It has been pointed out before that hubs, that is, proteins with a large number of interaction partners, have a higher average number of disordered residues (Dunker ; Haynes ; Patil and Nakamura, 2006; Singh ). This result may be surprising, as one might assume that interactions would constrain the protein towards ordered regions. Indeed, a recent study has disagreed with the previous finding (Schnell ). Here, we seek to clarify this result by putting it in the context of structural interaction networks. Surprisingly, we find that in the Structural Interaction Network (SIN v2.0) (Kim ), singlish-interface hubs have a much higher fraction of disordered residues than multi-interface hubs (Figure 1A). The reason for the higher disorder of singlish-interface versus multi-interface hubs seems obvious: multi-interface hubs tend to be much more constrained than singlish-interface hubs. Hence, we would expect multi-interface hubs to have a significantly lower level of disorder than non-hub proteins, whereas singlish-interface hubs would be at approximately the same level as non-hub proteins. However, when we compare both types of hubs to all other proteins, we find that multi-interface hubs have about the same propensity for disorder as other proteins, whereas singlish-interface hubs have a much higher propensity than other proteins (Figure 1B–D). Hence, the difference in degree of disorder between the two types of hubs is unlikely to be the result of structural constraints on multi-interface hubs, as the other proteins similarly lack these constraints.

Figure 1

(A) Disorder in singlish-interface hubs versus multi-interface hubs. (Wilcoxon ranksum test, P=1.0e−8). (B) Distribution of disordered residues in the yeast proteins. As can be seen, most yeast proteins have a relatively low level of disorder; however, almost all have some fraction of disordered residues. (C) Distribution of disorder in multi-interface hubs. The distribution does not significantly deviate from the distribution of the yeast proteome (Kolmogorov-Smirnov test, P-value=0.11). In other words, multi-interface hubs do not have different levels of disorder than normal proteins. (D) Distribution of disorder in singlish-interface hubs. Singlish-interface hubs show a different distribution in terms of disorder (Kolmogorov–Smirnov test, P=2.0e−6).

Disordered regions in proteins tend to be under fewer evolutionary constraints, contributing to the faster evolutionary rate of singlish-interface hubs

Previous studies have found that singlish-interface hubs have a significantly higher evolutionary rate than multi-interface hubs, presumably due to stronger constraints of the multiple interfaces (Kim ). However, other studies have suggested that this difference is due to a difference in protein abundance (Batada ). We hypothesized that the higher level of disorder would be related to this higher evolutionary rate. Indeed, it has been suggested that disordered proteins evolve faster than structured ones (Brown ). We find here that in a genome-wide analysis, disordered proteins have a significantly higher evolutionary rate than structured proteins (Figure 2A and B). As disordered proteins also tend to be expressed at a lower rate than structured ones (Supplementary Table S3), the causality is unclear. Hence, we looked at the evolutionary rate on a residue-by-residue basis, independent of any bias at the gene level. We find that disordered regions in proteins tend to evolve much faster than the other regions (Supplementary Table S4). Although structural factors only partly determine the evolutionary rate of proteins (Bloom ), a difference in disorder is likely to be a contributing factor.

Figure 2

dN/dS ratio of ordered and disordered proteins, and hubs. (A) All yeast proteins, split by order/disorder (Wilcoxon rank sum test, P-value <2.2e−16); (B) Hubs only (Wilcoxon rank sum test, P=1.5e−6).

Binding interfaces are structured

Disordered regions have been implicated in mediating promiscuous binding (Dunker ; Patil and Nakamura, 2006), thus enabling a protein to functionally bind to many diverse interacting partners. Also, singlish-interface hubs are known to be promiscuous binders and their interfaces presumably interact with many different partners. Hence, it seems reasonable to assume that the heightened level of disorder in singlish-interface hubs is due to their interfaces being involved in promiscuous binding (Singh ). Therefore, their binding interfaces should be highly disordered. However, when we examine the binding interfaces of singlish-interface hubs, we find them to be largely structured. Moreover, we do not find a significant difference in level of disorder between interfaces of singlish-interface and multi-interface hubs (Figure 3A).

Figure 3

(A) Disorder in the interface regions of singlish- and multi-interface hubs (Wilcoxon ranksum test, P=0.4). (B) Disorder of the binding partners of singlish- and multi-interface hubs (Wilcoxon ranksum test, P=4.5e−5). BIOGRID data are reported in the boxplot. Similar results were found for the Kim and Batada data sets (Supplementary Figure S5). (C) Schematic representation of disorder in singlish-interface hubs. Singlish-interface hubs have large portions of disordered regions (painted in gray). However, the interface itself is highly structured (painted in black). One reason for the disorder in the bulk of the protein is the fact that singlish-interface hubs often are targeted by kinases. On the other hand, they tend to be kinases themselves and target disordered regions in other proteins.

This leaves us with two questions: (1) with structured interfaces, how is the promiscuous binding of singlish-interface hubs mediated? (2) What leads to their higher level of disorder, if not promiscuous binding at the binding interface?

Higher disorder in interacting partners of singlish-interface hubs

We first turn to the question of how the binding promiscuity of singlish-interface hubs is mediated. We hypothesized that if the interface of the singlish-interface hub itself is structured, perhaps the binding partners would be disordered, thus leading to promiscuous binding. This case of a disordered-structured promiscuous interaction has been described recently (Dunker ). Indeed, when examining the binding partners of singlish-interface hubs for disorder, we find that they are significantly more disordered than the binding partners of multi-interface hubs, as well as more disordered than other proteins (Figure 3B). Hence, promiscuous binding is partly mediated by disorder, but not in the interface in the singlish-interface hub itself, rather in the interacting partners.

Enrichment of disordered regions in singlish-interface hubs can be rationalized by their cascading nature as is illustrated in their involvement in signaling pathways

We hypothesized that the higher propensity for disorder in interacting partners of singlish-interface hubs may be related to their own higher level of intrinsic disorder. That is, if singlish-interface hubs had a tendency to interact with each other in a cascade fashion, it would lead to two separate regions in the singlish-interface hub: a highly structured binding interface (that binds disordered regions in other proteins) and a disordered region, which in turn is bound by other singlish-interface hubs (Figure 3C). A recent study listed a number of examples of proteins with just this layout (Xie ). We find here that singlish-interface hubs indeed have a higher tendency to interact with each other (on average, 68% of singlish-interface hub partners are singlish-interface hubs). This cascading property is well illustrated in signaling pathways. Signaling pathways are thought to have evolved through a mix-and-match principle, consistent with the cascading nature that we observed in singlish-interface hubs (Pawson and Nash, 2003). That is, repeated duplications of these signaling genes must have occurred during evolution. Furthermore, it is known that signaling pathways tend to be enriched in disordered proteins (Iakoucheva , 2004). We find here that disordered proteins in particular have significantly higher numbers of paralogs than other proteins, suggesting that they have been duplicated more often (Supplementary Table S2; Supplementary Figure S4), and are perhaps less dosage sensitive. Kinases serve as a good illustrative example of signaling pathways. Indeed, there is a significant enrichment for protein kinases among singlish-interface hubs: about 34% of singlish-interface hubs are kinases (hypergeometric test, P-value=1e−33). Furthermore, it is known that the binding sites of both protein kinases and modular protein domains tend to lie in disordered regions (Iakoucheva ; Beltrao and Serrano, 2005).
When checking for the likely targets of protein kinases, we find a significant enrichment in singlish-interface hubs (Table I). Likewise, and in agreement with previous results, we find that disordered proteins are much more likely to be kinase targets (Supplementary Table S1). Hence, for these proteins, some of the heightened level of disorder may be due to the fact that they present kinase target sites.

Table I

Kinase targets

	Multi-interface hubs	Singlish-interface hubs
Non-kinase targets	165	56
Kinase targets	54	43

Contingency table of kinase targets versus hub interface. Singlish-interface hubs are enriched for being kinase targets (Fisher's exact test, P=0.001).
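The enrichment test on this contingency table can be sketched in a few lines. The block below is a minimal pure-Python illustration of a two-sided Fisher's exact test applied to the four counts in the table; it is not the authors' code (the analysis was performed in R), the function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption about which convention was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Rows: non-kinase targets, kinase targets.
# Columns: multi-interface hubs, singlish-interface hubs.
non_targets = (165, 56)
targets = (54, 43)

odds_ratio = (non_targets[0] * targets[1]) / (non_targets[1] * targets[0])
p_value = fisher_exact_two_sided(*non_targets, *targets)
print(f"odds ratio = {odds_ratio:.2f}, two-sided P = {p_value:.4g}")
```

An odds ratio above 1 corresponds to kinase targets being over-represented among singlish-interface hubs relative to multi-interface hubs.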

Furthermore, the concept of distal docking motifs for kinase targeting has recently been proposed (Remenyi ; Ubersax and Ferrell, 2007). This notion fits in very well with our results. In the simplest case, a kinase has a structured catalytic region and a second disordered region, which could harbor a distal docking motif.

Implications for different types of hubs in other networks

A concept related to singlish-interface and multi-interface hubs is the notion of party and date hubs (Han ), and it has been shown that there is some correspondence between the two (Kim ). Indeed, we find, consistent with earlier results (Ekman ; Singh ), that date hubs tend to have a higher degree of disorder than party hubs (in two versions of the FYI (Bertin ), Supplementary Figure S2a–b). However, there has been some controversy about the notion of date and party hubs (Batada , 2007) and potential biases in different data sets. Indeed, when examining the date-party hubs as defined by Batada and co-workers, we do not find a difference in the level of disorder (Supplementary Figure S2c). From this, one may conclude that the differences in disorder we observed are strongly dependent on data set choice and that gene expression data sets tend to be confounded by noise. Hence, we examined the HCI data set by Batada and co-workers more closely and used an approximate inference of which hubs would be singlish-interface and which would be multi-interface in their data set (see Materials and methods). We believe that this inference of the number of interfaces may be somewhat more robust, as it is related to a real biophysical property of proteins. With this classification, we observe a significant difference in disorder between the two hub classes (Supplementary Figure S3). In summary, we find evidence for the notion that some hubs have few binding interfaces (hence interact with their partners at different times), whereas others have many, and that both groups have distinct properties, such as a different level of disorder. This suggests that the notion of date and party hubs, being related, also reflects two distinct groups of proteins.

Conclusions

We have presented evidence here that intrinsic disorder is an important feature in protein networks. Specifically, it further distinguishes two types of hubs, multi- and singlish-interface, and is important in mediating promiscuous binding. However, the disordered regions do not seem to be enriched at the interface regions of singlish-interface hubs, but are rather enriched in their binding targets, presumably due to their central role in signaling pathways. Furthermore, protein disorder provides further evidence for the difference in evolutionary constraints among protein hubs.

Materials and methods

A number of different sources were utilized in this study. The data sets and analysis methods are described below.

Disorder prediction

We used DISOPRED (Ward ) to obtain disorder predictions for 6714 ORFs of Saccharomyces cerevisiae (including many dubious ORFs). This software tool provides both a score and a disorder classification for each residue. DISOPRED is among the top-ranking disorder prediction tools evaluated at the 'Critical Assessment of Techniques for Protein Structure Prediction' (CASP) conference (Moult ). The percentage of disordered residues is computed by dividing the number of disordered residues by the ORF length. ORFs with a percentage of disordered residues greater than 50% were considered disordered. Similarly, we computed the percentage of disorder of interacting interfaces by dividing the number of disordered residues in the interface by the interface length.
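The two ratios described above can be illustrated with a short sketch. It assumes the per-residue disorder classification is already available as a list of booleans (it does not run or parse DISOPRED); the function names, the toy ORF, and the 0-based interface positions are all hypothetical.

```python
def disorder_fraction(is_disordered):
    """Fraction of residues flagged as disordered (between 0.0 and 1.0)."""
    if not is_disordered:
        raise ValueError("sequence has no residues")
    return sum(bool(flag) for flag in is_disordered) / len(is_disordered)


def is_disordered_orf(is_disordered, cutoff=0.5):
    """An ORF counts as disordered when more than `cutoff` of its residues are."""
    return disorder_fraction(is_disordered) > cutoff


def interface_disorder_fraction(is_disordered, interface_positions):
    """Same ratio, restricted to the residues that make up an interface."""
    return disorder_fraction([is_disordered[i] for i in interface_positions])


# Hypothetical 10-residue ORF: residues 0-5 disordered, residues 6-9 ordered.
toy_orf = [True] * 6 + [False] * 4
toy_interface = range(6, 10)  # assumed 0-based interface positions

print(disorder_fraction(toy_orf))
print(is_disordered_orf(toy_orf))
print(interface_disorder_fraction(toy_orf, toy_interface))
```

The toy ORF mirrors the pattern reported for singlish-interface hubs: a largely disordered protein whose interface residues are ordered.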

Structural interaction network version 2

The definition of singlish- and multi-interface hubs is reported by Kim . We used an updated version of the SIN (SIN version 2.0). Among 316 hubs, 98 are singlish-interface and 218 are multi-interface hubs.

Party-hubs and date-hubs

Information about party- and date-hubs derives from three data sets: Han , Bertin , and Batada . The data set of Han and co-workers includes 108 party- and 91 date-hubs. The data set of Bertin and co-workers contains 306 date- and 240 party-hubs. For the data set by Batada and co-workers, we determined party- and date-hubs by first selecting the ORFs with more than 10 interacting partners. Party (date) hubs are then those whose average coexpression correlation with their interacting partners is greater than (less than) 0.25. This resulted in 175 date- and 33 party-hubs. Coexpression correlation was computed based on the compendium data by Hughes .
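The party/date assignment for the Batada data set can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: pairwise coexpression correlations are taken as precomputed, the function and variable names are hypothetical, and a hub whose average correlation is exactly 0.25 is labeled a date hub here because the text does not specify that case.

```python
def classify_hubs(partners, coexpression, min_partners=10, cutoff=0.25):
    """Label ORFs with more than `min_partners` partners as party or date hubs.

    partners:     dict mapping each ORF to a list of its interaction partners
    coexpression: dict mapping frozenset({orf_a, orf_b}) to a correlation
    """
    labels = {}
    for orf, partner_list in partners.items():
        if len(partner_list) <= min_partners:
            continue  # not a hub under the "more than 10 partners" rule
        values = [
            coexpression[frozenset((orf, p))]
            for p in partner_list
            if frozenset((orf, p)) in coexpression
        ]
        if not values:
            continue  # no expression data for any partner
        mean_corr = sum(values) / len(values)
        # Assumption: the boundary case (mean == cutoff) is treated as "date".
        labels[orf] = "party" if mean_corr > cutoff else "date"
    return labels


# Hypothetical toy network: two hubs with 11 partners each, one non-hub.
partners = {
    "HUB_A": [f"A{i}" for i in range(11)],
    "HUB_B": [f"B{i}" for i in range(11)],
    "NON_HUB": ["A0", "B0"],
}
coexpression = {}
for p in partners["HUB_A"]:
    coexpression[frozenset(("HUB_A", p))] = 0.6  # strongly coexpressed
for p in partners["HUB_B"]:
    coexpression[frozenset(("HUB_B", p))] = 0.1  # weakly coexpressed

labels = classify_hubs(partners, coexpression)
print(labels)
```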

Pfam interacting domains

Pfam interacting domains were obtained from the Pfam repository (Bateman ). To analyze the disorder of interfaces, the working hypothesis is that interacting domains confer binding capability to protein regions. The following cutoff values were used for domain assignment: (1) e-value of alignment <1e−4; (2) matched sequence length >80% of domain length; (3) domain length >12 residues. When using these constraints, we have 1342 ORFs with at least one interacting domain. In addition, this data set was employed to infer which of the date- and party-hubs by Batada and co-workers are multi- or singlish-interface hubs. In this case, more stringent criteria to assign a domain to an ORF were used: (1) e-value of alignment <1e−7; (2) matched sequence length >95% of Pfam domain length; (3) domain length >5 residues. Accordingly, 1738 ORFs have at least one Pfam domain, divided into 1441 singlish-domain ORFs and 327 multi-domain ORFs. Among those 1738, we only consider hubs (defined as having more than 10 interacting partners), resulting in 73 singlish-domain and 24 multi-domain hubs.
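The two sets of domain-assignment cutoffs can be expressed as a simple filter. The sketch below is only an illustration: the `DomainHit` record, its field names, and the toy hits are hypothetical and do not correspond to any Pfam file format. It counts accepted domains per ORF and deliberately stops short of the singlish/multi split, since the domain-count threshold for that split is not stated here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    orf: str
    domain: str
    e_value: float
    matched_length: int  # residues of the ORF aligned to the domain
    domain_length: int   # full length of the domain model


# Cutoffs quoted in the text for the two analyses.
INTERFACE_CUTOFFS = dict(max_e_value=1e-4, min_coverage=0.80, min_domain_length=12)
HUB_INFERENCE_CUTOFFS = dict(max_e_value=1e-7, min_coverage=0.95, min_domain_length=5)


def passes(hit, max_e_value, min_coverage, min_domain_length):
    """Apply the three criteria; all comparisons are strict, as in the text."""
    if hit.domain_length <= min_domain_length:
        return False
    coverage = hit.matched_length / hit.domain_length
    return hit.e_value < max_e_value and coverage > min_coverage


def domains_per_orf(hits, cutoffs):
    """Number of accepted domain hits for each ORF."""
    return Counter(hit.orf for hit in hits if passes(hit, **cutoffs))


# Hypothetical hits chosen to show how the two cutoff sets differ.
hits = [
    DomainHit("ORF1", "DomX", 1e-5, 90, 100),   # passes the first set only
    DomainHit("ORF2", "DomY", 1e-10, 98, 100),  # passes both sets
    DomainHit("ORF3", "DomZ", 1e-10, 10, 10),   # short domain: second set only
]

interface_counts = domains_per_orf(hits, INTERFACE_CUTOFFS)
hub_counts = domains_per_orf(hits, HUB_INFERENCE_CUTOFFS)
print(dict(interface_counts), dict(hub_counts))
```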

Kinase target data

We used the phosphorylome data set (Ptacek ) to obtain the list of kinase interaction partners. It contains 1325 ORFs known as targets for kinases.

Interaction data

Interaction data derive from several sources: BIOGRID (Stark ), Batada , and Kim . Each data set provides a list of the interacting ORFs. From BIOGRID, we included interactions determined by Affinity Capture-MS, Affinity Capture-RNA, Affinity Capture-Western, biochemical activity, co-crystal structure, Far Western, FRET, Protein-peptide, Protein-RNA, Reconstituted Complex, and Two-hybrid. These sources contain 61 634, 28 915, and 4080 interactions, respectively. We computed the average disorder of the interacting partners for each hub and assessed whether there is a difference between the partners of singlish- and multi-interface hubs by means of the Wilcoxon rank sum test. As singlish-interface hubs have other singlish-interface hubs as interacting partners, which are more disordered, we repeated the same analysis excluding partners that are themselves singlish-interface hubs. The difference between the partners of multi- and singlish-interface hubs is still significant (Supplementary Figure S5). Biases in the interaction network may affect our results. Indeed, the SIN is smaller than other interaction networks, and as it is based on proteins with solved crystal structures, it may be depleted in disordered proteins. However, we find evidence to the contrary: the average percentage of disordered residues in the SIN is about the same as the genomic average: 26% versus 25% (Wilcoxon rank sum test, P=0.08; Supplementary Figure S1).
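The per-hub partner average, including the variant that leaves out partners that are themselves singlish-interface hubs, can be sketched as below. The function name, protein identifiers, and disorder values are hypothetical toy data, not values from the study.

```python
def mean_partner_disorder(hub, partners, disorder, exclude=frozenset()):
    """Average disorder fraction over a hub's partners.

    partners: dict mapping each hub to a list of interaction partners
    disorder: dict mapping each ORF to its fraction of disordered residues
    exclude:  partners to leave out (for example, other singlish-interface hubs)
    Returns None when no partner with a disorder value remains.
    """
    values = [
        disorder[p]
        for p in partners.get(hub, ())
        if p in disorder and p not in exclude
    ]
    return sum(values) / len(values) if values else None


# Hypothetical toy data: hub S1 binds two ordinary proteins and one other
# singlish-interface hub (S2).
partners = {"S1": ["P1", "P2", "S2"]}
disorder = {"P1": 0.2, "P2": 0.6, "S2": 0.8}
singlish_hubs = frozenset({"S1", "S2"})

all_partners = mean_partner_disorder("S1", partners, disorder)
without_singlish = mean_partner_disorder(
    "S1", partners, disorder, exclude=singlish_hubs
)
print(all_partners, without_singlish)
```

Collecting these per-hub averages into one list for singlish-interface hubs and one for multi-interface hubs gives the two samples for a rank sum test, for instance with `scipy.stats.ranksums` if SciPy is available.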

Orthologs/paralogs

Ortholog and paralog information was derived from the Clusters of Orthologous Groups (COGs) (Tatusov ). Cluster information was used to determine the number of paralogs for each ORF.

Evolutionary rates

Sequence alignment between S. cerevisiae and S. bayanus was performed with BLAST. Each residue was then labeled as mutated or non-mutated, and the disorder analysis was carried out residue by residue.
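The residue-by-residue comparison can be illustrated with a minimal sketch. It assumes a pairwise alignment is already available as two equal-length strings and that a disorder flag is given per alignment column; skipping gapped columns is an assumption, as the text does not say how gaps were handled. The function name and the toy sequences are hypothetical.

```python
def residue_mutation_rates(seq_a, seq_b, is_disordered, gap="-"):
    """Fraction of mutated residues in disordered versus ordered positions."""
    if not (len(seq_a) == len(seq_b) == len(is_disordered)):
        raise ValueError("alignment rows and disorder mask must have equal length")
    counts = {True: [0, 0], False: [0, 0]}  # flag -> [mutated, total]
    for a, b, flag in zip(seq_a, seq_b, is_disordered):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not scored
        key = bool(flag)
        counts[key][1] += 1
        if a != b:
            counts[key][0] += 1

    def rate(key):
        mutated, total = counts[key]
        return mutated / total if total else None

    return {"disordered": rate(True), "ordered": rate(False)}


# Hypothetical aligned fragments; the last five columns are flagged disordered.
aligned_a = "MKTAYSLQ-E"
aligned_b = "MKSAYTLEAE"
mask = [False] * 5 + [True] * 5

rates = residue_mutation_rates(aligned_a, aligned_b, mask)
print(rates)
```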

Computational analysis

We used R 2.5 to perform the statistical analysis (R Development Core Team, 2007). All data sets used and the updated version of the SIN along with detailed description and statistics are available at http://sin.gersteinlab.org.

References (10 of 38 shown)

1. Functional discovery via a compendium of expression profiles.

Authors: T R Hughes; M J Marton; A R Jones; C J Roberts; R Stoughton; C D Armour; H A Bennett; E Coffey; H Dai; Y D He; M J Kidd; A M King; M R Meyer; D Slade; P Y Lum; S B Stepaniants; D D Shoemaker; D Gachotte; K Chakraburtty; J Simon; M Bard; S H Friend
Journal: Cell Date: 2000-07-07 Impact factor: 41.582

2. Intrinsic disorder and protein function.

Authors: A Keith Dunker; Celeste J Brown; J David Lawson; Lilia M Iakoucheva; Zoran Obradović
Journal: Biochemistry Date: 2002-05-28 Impact factor: 3.162

Review 3. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm.

Authors: P E Wright; H J Dyson
Journal: J Mol Biol Date: 1999-10-22 Impact factor: 5.469

4. Structural determinants of the rate of protein evolution in yeast.

Authors: Jesse D Bloom; D Allan Drummond; Frances H Arnold; Claus O Wilke
Journal: Mol Biol Evol Date: 2006-06-16 Impact factor: 16.240

5. Disordered domains and high surface charge confer hubs with the ability to interact with multiple proteins in interaction networks.

Authors: Ashwini Patil; Haruki Nakamura
Journal: FEBS Lett Date: 2006-03-10 Impact factor: 4.124

6. The Pfam protein families database.

Authors: Alex Bateman; Ewan Birney; Lorenzo Cerruti; Richard Durbin; Laurence Etwiller; Sean R Eddy; Sam Griffiths-Jones; Kevin L Howe; Mhairi Marshall; Erik L L Sonnhammer
Journal: Nucleic Acids Res Date: 2002-01-01 Impact factor: 16.971

7. BioGRID: a general repository for interaction datasets.

Authors: Chris Stark; Bobby-Joe Breitkreutz; Teresa Reguly; Lorrie Boucher; Ashton Breitkreutz; Mike Tyers
Journal: Nucleic Acids Res Date: 2006-01-01 Impact factor: 16.971

8. Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes.

Authors: Chad Haynes; Christopher J Oldfield; Fei Ji; Niels Klitgord; Michael E Cusick; Predrag Radivojac; Vladimir N Uversky; Marc Vidal; Lilia M Iakoucheva
Journal: PLoS Comput Biol Date: 2006-06-23 Impact factor: 4.475

9. What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae?

Authors: Diana Ekman; Sara Light; Asa K Björklund; Arne Elofsson
Journal: Genome Biol Date: 2006 Impact factor: 13.583

10. Comparative genomics and disorder prediction identify biologically relevant SH3 protein interactions.

Authors: Pedro Beltrao; Luis Serrano
Journal: PLoS Comput Biol Date: 2005-08-12 Impact factor: 4.475
