Åsa Pérez-Bercoff, Corey M Hudson, Gavin C Conant.
Abstract
Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.
Year: 2013 PMID: 23320073 PMCID: PMC3539715 DOI: 10.1371/journal.pone.0052581
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. PPI presence and absence at the different nodes in the rooted eutherian phylogenetic tree.
A) At each node, we show the predicted percentage of human PPIs present at that node (necessarily 100% at the human tip). The percentages at the other seven tip nodes were inferred from the presence or absence of the orthologs of the two human proteins making up the PPI (Methods). We then inferred the states of the internal nodes under the assumption that a given PPI ortholog pair could appear only once in the phylogeny (Methods). The topology was visualized using FigTree [61]. Branch lengths are the mean Ks value (i.e., the number of synonymous substitutions per synonymous site) found across the genes surveyed for that branch of the tree (see Methods). The five colored branches indicate potential origin points for a PPI under our limited parsimony model (Methods), while the two gray branches were used to estimate the rate of PPI loss. The dashed branches indicate that the Ks values could not be distinguished for these two branches, because the models used produce unrooted trees. B) There is an association between the age of the branch along which a PPI appears (x-axis; estimated via Ks above) and the average interaction degree of the proteins that make up that interaction (y-axis). Note that the blue distance was estimated as one-half the Ks distance between the rodent-primate and horse-dog-cow clades in the unrooted topology of (A). See Methods for details.
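The single-origin assumption in the caption can be illustrated with a short sketch. Because a PPI is always present in human and may arise only once, its latest possible origin is the branch leading to the smallest clade that contains human and every other species where both orthologs are found. The nested clades below are an assumption read from the clade names used in this record (they would be consistent with five candidate origin branches), and the function names are hypothetical; the authors' actual procedure is described in their Methods, which are not reproduced here.

```python
# Nested clades that contain human, from the tip outwards.
# ASSUMPTION: this nesting is inferred from the clade names in this record,
# not taken from the authors' tree file.
HUMAN_LINEAGE = [
    ("human", {"human"}),
    ("human/chimpanzee", {"human", "chimpanzee"}),
    ("primates", {"human", "chimpanzee", "macaque"}),
    ("primates/rodents", {"human", "chimpanzee", "macaque", "mouse", "rat"}),
    ("eutherian root", {"human", "chimpanzee", "macaque", "mouse", "rat",
                        "horse", "dog", "cow"}),
]


def origin_branch(species_with_ppi):
    """Name the branch leading to the smallest human-containing clade that
    covers every species in which the PPI ortholog pair is present."""
    present = set(species_with_ppi) | {"human"}  # present in human by definition
    for name, clade in HUMAN_LINEAGE:
        if present <= clade:
            return name
    raise ValueError("species outside the eight-mammal tree")


def absent_within_clade(species_with_ppi):
    """Species inside the origin clade that lack the pair; under a single
    origin these absences must be explained by later loss."""
    present = set(species_with_ppi) | {"human"}
    clade = dict(HUMAN_LINEAGE)[origin_branch(present)]
    return sorted(clade - present)
```

For example, a pair found in human and mouse only would be assigned to the branch leading to the primate/rodent ancestor, with chimpanzee, macaque and rat counted as absences that require loss.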
Coevolution between PPI partners detected using correlated changes in selective constraint.
| Dataset/ω cutoff | Clade removed | Number of PPIs | P | Mean Spearman’s correlation (real data) | Mean of means (Spearman’s correlation, 1000 simulations) | Difference |
| Full data set: 0≤ω<∞ | None | 7730 | | 0.131 | 0.122 | 0.009 |
| 0≤ω<5 | None | 7727 | <0.001 | 0.132 | 0.122 | 0.009 |
| 0≤ω<5 | Human | 7705 | <0.001 | 0.123 | 0.110 | 0.013 |
| 0≤ω<5 | Chimpanzee | 7668 | <0.001 | 0.102 | 0.099 | 0.003 |
| 0≤ω<5 | Macaque | 7303 | <0.001 | 0.126 | 0.108 | 0.018 |
| 0≤ω<5 | Mouse | 7173 | 0.007 | 0.132 | 0.128 | 0.004 |
| 0≤ω<5 | Rat | 7132 | 0.003 | 0.137 | 0.132 | 0.005 |
| 0≤ω<5 | Horse | 6937 | 0.001 | 0.139 | 0.131 | 0.008 |
| 0≤ω<5 | Dog | 6930 | 0.005 | 0.135 | 0.128 | 0.007 |
| 0≤ω<5 | Cow | 6785 | 0.011 | 0.133 | 0.127 | 0.006 |
| 0≤ω<5 | Human/Chimp | 7563 | <0.001 | 0.095 | 0.084 | 0.011 |
| 0≤ω<5 | Primates | 6113 | <0.001 | 0.070 | 0.054 | 0.016 |
| 0≤ω<5 | Rodents | 5123 | 0.091 | 0.106 | 0.111 | −0.005 |
| 0≤ω<5 | Horse/Dog | 5893 | 0.061 | 0.141 | 0.138 | 0.003 |
| 0≤ω<5 | Horse/Dog/Cow | 3456 | 0.421 | 0.165 | 0.169 | −0.004 |
Dataset/ω cutoff: values of branch-wise selective constraint (ω) allowed when computing Spearman’s correlation between the ω values of paired branches for two proteins with a known PPI in humans (Methods).
Clade removed: values of ω from the indicated clades were removed before the calculation of Spearman’s correlation.
Number of PPIs: we required at least 6 common branches between the orthologous gene trees of the two interacting proteins; the third column gives the number of PPIs meeting this requirement.
P: P-value of the hypothesis test that the real PPI pairs had a higher mean Spearman’s correlation than would be expected, given the distribution of correlations seen from 1000 simulations of the same number of pseudo-PPI pairs drawn from non-interacting proteins (Methods).
Mean of means: mean of the mean correlations seen from 1000 simulations, each consisting of the same number of pseudo-PPIs as the number of real PPIs in the third column.
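The table notes above describe two steps: a Spearman’s correlation of branch-wise ω values for each interacting pair, and a comparison of the real mean correlation against means from simulated pseudo-PPI sets. The following is a minimal sketch of those steps in plain Python. The function names, the dictionary input format, the per-branch application of the ω cutoff, and the use of a simple "fraction of simulations at least as large" P-value are all assumptions made for illustration; the authors' exact procedure is given in their Methods.

```python
from statistics import mean


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's correlation, computed as Pearson's correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None  # correlation undefined when one side is constant
    return cov / (vx * vy) ** 0.5


def pair_correlation(omega_a, omega_b, cutoff=5.0, min_branches=6):
    """omega_a, omega_b: dicts mapping branch name -> omega for each protein.
    ASSUMPTION: branches failing the cutoff are dropped individually."""
    shared = [b for b in omega_a if b in omega_b
              and 0 <= omega_a[b] < cutoff and 0 <= omega_b[b] < cutoff]
    if len(shared) < min_branches:
        return None  # too few common branches, pair is skipped
    return spearman([omega_a[b] for b in shared],
                    [omega_b[b] for b in shared])


def empirical_p(real_mean, simulated_means):
    """One-sided P: fraction of simulated means >= the real mean."""
    hits = sum(m >= real_mean for m in simulated_means)
    return hits / len(simulated_means)
```

With 1000 simulations, a result in which no simulated mean reaches the real mean would be reported as P<0.001 under this scheme.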
Over- and under-represented GO terms of genes present at least once in a primate-specific PPI.
| Class | ID | GO term | #Obs | #Exp | P | Fold excess |
| Biological process | 0006139 | nucleobase-containing compound metabolic process | 518 | 467.7 | 4.0×10−2 | 1.21 |
| Biological process | 0007154 | cell communication | 692 | 528.8 | 9.6×10−21 | 1.51 |
| Biological process | 0007275 | multicellular organismal development | 403 | 353.5 | 1.6×10−2 | 1.26 |
| Biological process | 0008219 | cell death | 230 | 149.4 | 4.1×10−12 | 1.89 |
| Biological process | 0009987 | cellular process | 989 | 916.3 | 3.5×10−4 | 1.17 |
| Biological process | 0030154 | cell differentiation | 256 | 193.7 | 4.4×10−6 | 1.53 |
| Biological process | 0032501 | multicellular organismal process | 246 | 207.9 | 3.3×10−2 | 1.32 |
| Biological process | 0043170 | macromolecule metabolic process | 893 | 781.6 | 4.2×10−9 | 1.26 |
| Biological process | 0050789 | regulation of biological process | 1011 | 865.8 | 8.4×10−16 | 1.30 |
| Biological process | 0050896 | response to stimulus | 385 | 314.3 | 1.4×10−5 | 1.39 |
| Biological process | 0051704 | multi-organism process | 119 | 79.6 | 2.7×10−5 | 1.81 |
| Molecular function | 0004871 | signal transduction activity | 137 | 74.1 | 4.9×10−13 | 2.39 |
| Molecular function | 0005515 | protein binding | 1264 | 1065.8 | 1.6×10−33 | 1.28 |
| Molecular function | 0016301 | kinase activity | 183 | 116.3 | 3.0×10−10 | 1.90 |
| Molecular function | 0016491 | oxidoreductase activity | 27 | 54.6 | 1.3×10−4 | 0.45 |
| Molecular function | 0016740 | transferase activity | 199 | 163.9 | 3.9×10−2 | 1.33 |
#Obs: observed instances of the GO term. 1675 genes present in primate PPIs vs 7201 genes never observed in primate PPIs.
#Exp: expected number of occurrences among a randomly selected set of genes of the same size.
P: P-values for the test of the hypothesis of no difference between the observed and expected number of occurrences of the term, after a Bonferroni multiple-test correction.
Oxidoreductase activity (fold excess below 1) was under-represented among the primate-specific PPIs.
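The expected counts and corrected P-values described in the notes above can be sketched as follows. This record does not name the statistical test used, so the hypergeometric upper tail below is swapped in purely for illustration; the function names and the way the expected count is derived from a background frequency are likewise assumptions.

```python
from math import comb


def expected_count(set_size, term_in_background, background_size):
    """Expected occurrences of a term in a random gene set of the given size."""
    return set_size * term_in_background / background_size


def hypergeom_upper_tail(k, background_size, term_in_background, set_size):
    """P(X >= k) when drawing set_size genes without replacement.
    ASSUMPTION: illustrative test choice, not confirmed by the source."""
    top = min(term_in_background, set_size)
    numerator = sum(
        comb(term_in_background, i)
        * comb(background_size - term_in_background, set_size - i)
        for i in range(k, top + 1)
    )
    return numerator / comb(background_size, set_size)


def bonferroni(p_values, n_tests=None):
    """Bonferroni correction: multiply each P by the number of tests, cap at 1."""
    n = n_tests if n_tests is not None else len(p_values)
    return [min(1.0, p * n) for p in p_values]
```

The Bonferroni step is conservative: a term remains significant only if its raw P-value is below the chosen threshold divided by the number of terms tested.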
Figure 2. Differences between primate-specific and phylogenetically-distributed interactions.
A) Gene sets used in the GO analyses of primate-specific protein interactions. There are 8876 human genes having at least one interaction (for a total of 32,916 PPIs). Of those interactions, 1502 (encoded by 1675 genes) are found only in primates. Of those 1675 genes, 1521 are also involved in other, non-primate-specific interactions, and 154 are involved only in primate-specific interactions. B) Genes involved in primate-specific interactions have, on average, more total interactions (i.e., the genes involved in these interactions tend to have a higher degree k). The distribution of the difference in degree (k) between the two genes in each pair of interacting proteins was compared (here referred to as ‘absolute degree difference’, Δk; x-axis). The primate-specific interactions (primatePPIs) are shown in black, while the red dashed line shows the remainder of the interactions.
Connectivity statistics of genes involved in primate PPIs vs genes that are part of nonprimate PPIs.
| Measure | Primate genes | Nonprimate genes | Primate-specific genes | All other genes |
| kmin | 1 | 1 | 1 | 1 |
| kmax | 240 | 110 | 31 | 240 |
| kmean | 18.6 | 5.1 | 1.8 | 7.7 |
| P (Wilcoxon test) | 2×10−16 (primate vs nonprimate genes) | | 2×10−16 (primate-specific vs all other genes) | |
Primate-specific genes: the set of genes involved only in primate-specific interactions.
All other genes: all genes not in the primate-specific set.
P-values are from a Wilcoxon test.
Figure 3. Paired cases of relaxed selective constraints for PPI pairs.
For each clade in Figure 1, we plot the number of cases where both members have either ρ>1.0 (A) or >0.5 (B). P-values are shown for the test of the hypothesis that there are more such shared cases of relaxed constraint than would be expected by chance (χ2 test, Methods). Cases where no P-value is shown had too few observations of ρ>5 for valid statistical conclusions to be drawn.
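The caption above tests whether both members of a pair exceed a constraint threshold more often than chance would predict. A minimal sketch of one way to set up such a test is a 2×2 table of "first protein above threshold" against "second protein above threshold", with the expected number of shared cases taken from the marginal totals. The Methods are not part of this record, so the table layout, the function name, and the 1-degree-of-freedom χ2 approximation are assumptions. Note also that the χ2 statistic is two-sided, so the observed and expected counts should be compared to confirm the direction of any excess.

```python
from math import erfc, sqrt


def shared_relaxation_test(value_pairs, threshold=0.5):
    """value_pairs: iterable of (value_a, value_b), one tuple per PPI on a
    given branch. Returns (observed_both, expected_both, chi2, p)."""
    both = only_a = only_b = neither = 0
    for va, vb in value_pairs:
        a, b = va > threshold, vb > threshold
        if a and b:
            both += 1
        elif a:
            only_a += 1
        elif b:
            only_b += 1
        else:
            neither += 1
    n = both + only_a + only_b + neither
    rows = [both + only_a, only_b + neither]   # protein A: above / not above
    cols = [both + only_b, only_a + neither]   # protein B: above / not above
    expected_both = rows[0] * cols[0] / n
    if 0 in rows or 0 in cols:
        return both, expected_both, None, None  # test undefined
    observed = [[both, only_a], [only_b, neither]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            exp = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - exp) ** 2 / exp
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = erfc(sqrt(chi2 / 2))
    return both, expected_both, chi2, p
```

When a branch has very few pairs above the threshold, the expected cell counts become small and the χ2 approximation is unreliable, which matches the caption's remark that some clades had too few observations for a valid test.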
Over- and under-represented GO terms among genes present in PPIs where both proteins in the pair have ω>0.5 for both branches, vs the remaining 4506 genes.
| Class | ID | GO term | #Obs | #Exp | P | Fold excess |
| Biological process | 0008219 | cell death | 75 | 45.6 | 9.9×10−5 | 1.81 |
| Biological process | 0050789 | regulation of biological process | 289 | 257.1 | 3.0×10−2 | 1.15 |
| Biological process | 0050896 | response to stimulus | 142 | 86.3 | 5.3×10−10 | 1.81 |
| Biological process | 0051704 | multi-organism process | 39 | 21.4 | 3.4×10−3 | 2.03 |
| Molecular function | 0004872 | receptor activity | 71 | 49.4 | 2.7×10−2 | 1.55 |
| Molecular function | 0005515 | protein binding | 384 | 336.5 | 8.0×10−6 | 1.18 |
#Obs: observed instances of the GO term. 524 genes with ω>0.5 for both branches vs the remaining 4506 genes (of 5030 genes in total, from 12472 PPIs for which mirrortrees could be constructed with reliable ML scores).
#Exp: expected number of occurrences among a randomly selected set of genes of the same size.
P: uncorrected P-value for the test of the hypothesis of no difference between the observed and expected number of occurrences of the term.