Molecular Chaperones Accelerate the Evolution of Their Protein Clients in Yeast.

David Alvarez-Ponce1,2, José Aguilar-Rodríguez3,4, Mario A Fares2,5.   

Abstract

Protein stability is a major constraint on protein evolution. Molecular chaperones, also known as heat-shock proteins, can relax this constraint and promote protein evolution by diminishing the deleterious effect of mutations on protein stability and folding. This effect, however, has only been established for a few chaperones. Here, we use a comprehensive chaperone-protein interaction network to study the effect of all yeast chaperones on the evolution of their protein substrates, that is, their clients. In particular, we analyze how yeast chaperones affect the evolutionary rates of their clients at two very different evolutionary time scales. We first study the effect of chaperone-mediated folding on protein evolution over the evolutionary divergence of Saccharomyces cerevisiae and S. paradoxus. We then test whether yeast chaperones have left a similar signature on the patterns of standing genetic variation found in modern wild and domesticated strains of S. cerevisiae. We find that genes encoding chaperone clients have diverged faster than genes encoding non-client proteins when controlling for their number of protein-protein interactions. We also find that genes encoding client proteins have accumulated more intraspecific genetic diversity than those encoding non-client proteins. In a number of multivariate analyses, controlling for other well-known factors that affect protein evolution, we find that chaperone dependence explains the largest fraction of the observed variance in the rate of evolution at both evolutionary time scales. Chaperones affecting rates of protein evolution mostly belong to two major chaperone families: Hsp70s and Hsp90s. Our analyses show that protein chaperones, by virtue of their ability to buffer destabilizing mutations and their role in modulating protein genotype-phenotype maps, have a considerable accelerating effect on protein evolution.
© The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

Keywords: dN/dS; molecular chaperones; mutational robustness; protein evolution

Year:  2019        PMID: 31297528      PMCID: PMC6735891          DOI: 10.1093/gbe/evz147

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Proteins within the proteome of any organism evolve at very different rates: whereas some proteins remain largely unaltered during long evolutionary periods, others can undergo fast evolutionary changes (Zuckerkandl and Pauling 1965; Zuckerkandl 1976; Li et al. 1985). The reasons for this diversity in rates of protein evolution are still a subject of intense debate (Rocha 2006; Alvarez-Ponce 2014; Zhang and Yang 2015). A number of factors have been shown to affect rates of evolution, including gene expression levels (Pál et al. 2001; Drummond et al. 2005), expression breadth in multicellular organisms (Duret and Mouchiroud 2000; Wright et al. 2004; Zhang and Li 2004; Alvarez-Ponce and Fares 2012), essentiality (Hurst and Smith 1999; Jordan et al. 2002; Alvarez-Ponce et al. 2016; Aguilar-Rodríguez and Wagner 2018), duplicability (Nembaware et al. 2002; Yang et al. 2003; Pegueroles et al. 2013), and the number of protein–protein interactions (Fraser et al. 2002; Hahn and Kern 2005; Alvarez-Ponce and Fares 2012). However, a comprehensive understanding of which factors affect rates of protein evolution, their relative impacts on rates of evolution, and the molecular mechanisms underlying these impacts, is lacking. Molecular chaperones (Ellis 1987) help other proteins achieve their functional and 3D native conformations, prevent protein aggregation, and restore the native conformation of proteins destabilized by environmental perturbations (Hartl and Hayer-Hartl 2009; Hartl et al. 2011). As such, they can render neutral certain amino acid substitutions that would otherwise (in the absence of chaperones) be deleterious (or at least diminish their negative fitness effects) (Tokuriki and Tawfik 2009). Chaperones thus represent an extrinsic source of protein robustness: They can increase the tolerance of a protein phenotype (e.g., protein structure responsible for the protein function) against mutational insults. 
Therefore, chaperones can be not only a source of environmental robustness but also of mutational robustness (Jarosz et al. 2010; Lauring et al. 2013; Fares 2015; Payne and Wagner 2019). That is, chaperones can effectively buffer certain types of mutations in proteins, and thus are expected to contribute to the accumulation of genetic variation, and to increase the rates of evolution of their clients. This increased rate of protein evolution of the clients of certain chaperones has been detected at the genomic level in a number of studies. Comparative analysis of bacterial genomes shows that the GroEL/ES chaperonin system can increase the evolutionary rate of its client proteins: after controlling for confounding factors, proteins that are clients of the system evolve faster on average than those that are not clients (Bogumil and Dagan 2010; Williams and Fares 2010). The bacterial DnaK also accelerates the rate of evolution of its clients (Aguilar-Rodríguez et al. 2016; Kadibalban et al. 2016). In yeast, Hsp90 clients evolve faster than their non-client paralogs (Lachowiec et al. 2013), and distinct groups of proteins interacting with different chaperones evolve at different rates (Bogumil et al. 2012). In mammals, kinases with higher binding affinity to Hsp90 evolve faster than kinases with lower binding affinity (Lachowiec et al. 2015). It has also been shown that both co- and posttranslationally acting chaperones can promote nonconservative amino acid substitutions, more likely destabilizing mutations, in their clients (Pechmann and Frydman 2014). However, most studies so far have focused on individual chaperones and species, and the effect of most chaperones on protein evolution remains unknown. In this study, we evaluate the effect of all yeast protein chaperones on the evolution of their protein clients. We conducted a comprehensive analysis of the chaperone–client interaction network of 35 chaperones in yeast (Gong et al. 2009). 
This network was established with TAP-tag pulldown assays followed by both liquid chromatography tandem mass spectrometry (LC-MS/MS) and by matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF). We used this high-quality network to evaluate whether chaperone clients evolve faster in yeast, and also to measure the contribution of different chaperone families to this acceleration of the rate of protein evolution. We show that many chaperones accelerate not only the rates of evolution of their clients but also their levels of nonsynonymous polymorphism.

Materials and Methods

Rates of Protein Evolution

The S. cerevisiae and S. paradoxus protein and coding (CDS) sequences were obtained from the Saccharomyces Genome Database (Cherry et al. 2012). Each S. cerevisiae protein sequence was used as query in a BLASTP search (E value cutoff = 10⁻¹⁰) against the S. paradoxus proteome. Similarly, each S. paradoxus protein was used in a BLASTP search against the S. cerevisiae proteome. Pairs of best reciprocal hits were considered to be encoded by pairs of orthologs. For each pair of orthologs, protein sequences were aligned using ProbCons (Do et al. 2005), and the resulting alignments were used to guide the alignment of the corresponding CDSs. PAML version 4.4d (codeml program, M0 model; Yang 2007) was used to estimate dN, dS, and dN/dS values.
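The best-reciprocal-hit rule can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `reciprocal_best_hits`, the gene identifiers, and the assumption that each dictionary already holds the single best BLASTP hit passing the E value cutoff are all hypothetical.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )


# Hypothetical best hits in each direction (assumed already filtered by E value).
scer_to_spar = {"YAL001C": "spar_1", "YAL002W": "spar_2", "YAL003W": "spar_9"}
spar_to_scer = {"spar_1": "YAL001C", "spar_2": "YAL005C", "spar_9": "YAL003W"}

orthologs = reciprocal_best_hits(scer_to_spar, spar_to_scer)
print(orthologs)
```

In this toy input, YAL002W is dropped because its best hit (spar_2) points back to a different S. cerevisiae gene.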

Positive Selection Analyses

The Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus protein and CDS sequences were obtained from the Saccharomyces Genome Database (Cherry et al. 2012). Each S. cerevisiae protein sequence was used as query in a BLASTP search (E value cutoff = 10⁻¹⁰) against the S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus proteomes. Similarly, each S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus protein was used in a BLASTP search against the S. cerevisiae proteome. Pairs of best reciprocal hits were considered to be encoded by pairs of orthologs. Only genes with putative orthologs in S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus were retained for analysis. For each group of orthologs, protein sequences were aligned using ProbCons (Do et al. 2005), and the resulting alignments were used to guide the alignment of the corresponding CDSs. Alignments were filtered as in a previous study (Luisi et al. 2015). The filtered alignments were used in tests of positive selection using PAML version 4.4d (codeml program, M8 vs. M7 test; Yang et al. 2000). Twice the difference in the log-likelihood of both models was assumed to follow a χ² distribution with two degrees of freedom. Genes with a P value <0.05 and a fraction of codons with dN/dS >1 were assumed to be under positive selection. All computations were run using three starting dN/dS values (0.04, 0.4, and 4) in order to alleviate the problem of local optima. The alignments corresponding to genes with signatures of positive selection were visualized using BioEdit version 7.2.5 in order to discard alignment or annotation errors.
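The likelihood-ratio test described above can be sketched in a few lines. With two degrees of freedom the chi-square survival function reduces exactly to exp(−x/2), so no statistics library is needed. The function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def lrt_p_value_df2(lnl_null, lnl_alt):
    """Likelihood-ratio test with 2 degrees of freedom.

    The statistic is twice the log-likelihood difference; for df = 2 the
    chi-square survival function is exactly exp(-x / 2).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # clamp tiny negative values
    return stat, math.exp(-stat / 2.0)


# Hypothetical log-likelihoods for the null (M7) and alternative (M8) models.
stat, p = lrt_p_value_df2(lnl_null=-2050.0, lnl_alt=-2045.0)
print(stat, p)
```

Under the criterion in the text, a gene would additionally need a class of codons with dN/dS > 1 under M8 before being called positively selected.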

Chaperone Client Data

Chaperone–client interaction data were obtained from Gong et al. (2009). Their study included 35 chaperones and 29 co-chaperones. For each chaperone, we obtained a list of clients from their supplementary table 2.

Additional Information

For each S. cerevisiae gene, the following information was gathered from different sources. The nonsynonymous to synonymous polymorphism ratio was obtained from Peter et al. (2018). For each gene, the average dN/dS across all pairs of genomes was used (YN00). We obtained gene expression data for S. cerevisiae grown in rich media (YPAD) at 30 °C to mid exponential phase, where gene expression levels are measured as number of RNA-seq reads per gene length (Nagalakshmi et al. 2008). The number of protein–protein interactions (degree centrality) was obtained from the BioGRID database, version v3.2.101. Only physical, nonredundant interactions among S. cerevisiae proteins were included in the analysis. Degrees were recomputed on a high-quality subnetwork, including those interactions determined by low-throughput studies or by more than one high-throughput study. A list of paralogs was obtained from Ensembl’s Biomart (Kinsella et al. 2011), and genes with at least one paralog were classified as duplicates. A list of genes essential for growth in rich glucose media was obtained from Giaever et al. (2002).
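The high-quality subnetwork rule (keep an interaction if it comes from a low-throughput study or from more than one high-throughput study) can be sketched as follows. The record layout `(protein_a, protein_b, study_id, throughput)` is a hypothetical simplification and not BioGRID's actual file format; skipping self-interactions is likewise an assumption.

```python
from collections import defaultdict


def high_quality_degrees(records):
    """Compute degrees on a filtered, nonredundant interaction network.

    records: iterable of (protein_a, protein_b, study_id, throughput),
    where throughput is "low" or "high" (hypothetical layout).
    """
    low_throughput = set()
    high_throughput_studies = defaultdict(set)
    for a, b, study, throughput in records:
        if a == b:
            continue  # assumption: self-interactions are ignored
        pair = frozenset((a, b))  # unordered pair, so A-B and B-A collapse
        if throughput == "low":
            low_throughput.add(pair)
        else:
            high_throughput_studies[pair].add(study)
    kept = low_throughput | {
        pair for pair, studies in high_throughput_studies.items()
        if len(studies) >= 2
    }
    degree = defaultdict(int)
    for pair in kept:
        for protein in pair:
            degree[protein] += 1
    return dict(degree)


records = [
    ("A", "B", "s1", "high"),
    ("B", "A", "s2", "high"),  # same pair, second high-throughput study
    ("A", "C", "s3", "low"),
    ("B", "C", "s4", "high"),  # single high-throughput study: dropped
    ("C", "D", "s5", "high"),
    ("C", "D", "s5", "high"),  # duplicate report from the same study: dropped
]
degrees = high_quality_degrees(records)
print(degrees)
```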

Statistical Analyses

Statistical analyses were conducted using R (R Core Team 2014). Partial correlation analyses were conducted using the “pcor.test” function (Kim 2015). We used the package “pls” to carry out the principal component regression analysis. We carried out base-10 logarithmic transformations of the continuous variables when such transformations led to a higher R². If a continuous variable contained values equal to zero, we added a small constant (0.001) to all its values to allow its logarithmic transformation. We scaled the independent variables to zero mean and unit variance.
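The transformation steps above (adding 0.001 when zeros are present, taking base-10 logarithms, then scaling) can be sketched as follows. The function names and input values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption about how "unit variance" was defined.

```python
import math
import statistics


def log10_with_offset(values, offset=0.001):
    """Base-10 log; if any value is zero, add the offset to all values first."""
    if any(v == 0 for v in values):
        values = [v + offset for v in values]
    return [math.log10(v) for v in values]


def standardize(values):
    """Scale to zero mean and unit variance (sample standard deviation)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical expression levels, one of which is zero.
logged = log10_with_offset([0.0, 9.999, 99.999])
scaled = standardize(logged)
print(logged, scaled)
```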

Results

Yeast Chaperone Clients Evolve Slower than Non-clients

We classified all Saccharomyces cerevisiae proteins into three classes: chaperones (n = 35), co-chaperones (n = 29), and others (n = 6,653), using the chaperone and co-chaperone list by Gong et al. (2009). The latter class was further classified into chaperone clients (those that interact with any of the chaperones according to the data set of Gong et al. 2009; n = 4,209) and non-clients (all remaining proteins, n = 2,444). For each S. cerevisiae gene, the most likely ortholog in S. paradoxus was identified using a best-reciprocal-hit approach (see Materials and Methods), and the rate of protein evolution was measured from the nonsynonymous to synonymous divergence ratio (dN/dS). These species diverged from a common ancestor ∼5–10 Ma ago (Dori-Bachash et al. 2011). Orthologs could be identified for 5,603 of the S. cerevisiae genes. Values of dN/dS >8 were removed, as they probably represent artifacts (ten genes were removed). The mean dN/dS value was 0.1553, and the median was 0.0970, consistent with prior results (e.g., Alvarez-Ponce et al. 2017). After applying these filters, a total of 3,958 clients and 1,574 non-clients were available for analysis. All remaining genes were excluded from further analyses. Clients exhibit substantially lower dN/dS values (median: 0.0930) than non-clients (median: 0.1149; Mann–Whitney U test, P value = 9.48 × 10⁻²²; fig. 1 and table 1). They also exhibit lower dN and higher dS values (fig. 1 and table 1). Next, we considered whether the number of chaperones of which each protein is client correlates with its rate of evolution. Among the 3,958 genes that have an ortholog in S. paradoxus and are clients of at least one chaperone, dN/dS negatively correlates with the number of chaperones (Spearman’s rank correlation coefficient, ρ = −0.0784, P = 7.79 × 10⁻⁷). The number of chaperones also correlates with dN (ρ = −0.0596, P = 0.0002) and, to a lesser extent, with dS (ρ = 0.0323, P = 0.0422).
Fig. 1.—Rates of evolution of yeast chaperone clients and non-clients. Outliers (those above the 90th and below the 10th percentiles) are not shown. Significance levels: *P < 0.05, **P < 0.001, and ***P < 10⁻⁵.

Table 1

Comparison between Yeast Chaperone Clients and Non-clients

| Variable | Clients n | Clients mean | Clients median | Non-clients n | Non-clients mean | Non-clients median | P value |
|---|---|---|---|---|---|---|---|
| dN/dS | 3,958 | 0.1165 | 0.0930 | 1,574 | 0.2563 | 0.1149 | 9.48 × 10⁻²²*** |
| dN | 3,958 | 0.0432 | 0.0355 | 1,574 | 0.0653 | 0.0411 | 5.66 × 10⁻¹²*** |
| dS | 3,958 | 0.3795 | 0.3817 | 1,574 | 0.3722 | 0.3655 | 2.80 × 10⁻¹¹*** |
| Number of protein–protein interactions | 3,875 | 30.3130 | 16 | 1,265 | 18.7107 | 8 | 3.62 × 10⁻⁵³*** |
| Expression level | 3,434 | 71.1133 | 23 | 1,184 | 69.5845 | 20 | 1.71 × 10⁻⁵*** |
| Protein length | 3,958 | 553.5682 | 462 | 1,574 | 327.5172 | 269 | 3.10 × 10⁻¹³⁸*** |

Note.—For each pair of clients versus non-client values, the highest value is shown in bold face. P values correspond to the Mann–Whitney test.

***P < 10⁻⁵.

We next considered whether chaperone clients may be enriched in proteins encoded by genes under positive selection. For each S. cerevisiae gene, we identified its most likely orthologs in another four species of the genus Saccharomyces (S. paradoxus, Saccharomyces mikatae, Saccharomyces kudriavzevii, and Saccharomyces bayanus). Only genes with a putative ortholog in all species (n = 2,047) were included in this analysis. The M8 versus M7 test (Yang et al. 2000) was used to identify signatures of positive selection (see Materials and Methods). Among chaperone clients, 19 (3.40%) were encoded by genes under positive selection. Among non-clients, 72 (4.84%) were encoded by genes with signatures of positive selection. The fraction of genes under positive selection was not significantly different between clients and non-clients (Fisher’s exact test, P = 0.0967).
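For readers unfamiliar with Fisher’s exact test, the following minimal sketch shows one common two-sided definition: sum the hypergeometric probabilities of all 2 × 2 tables with the observed margins that are no more probable than the observed table. The function name and the counts are hypothetical and are not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # The small relative tolerance guards against floating-point ties.
    return sum(
        prob(k) for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-7)
    )


# Hypothetical counts: rows are two gene classes, columns are
# "under positive selection" and "not under positive selection".
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(p_value)
```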

The Low Rate of Evolution of Chaperone Clients Is Not Due to Their Expression Levels, Essentiality, or Duplicability

Rates of protein evolution are affected by a number of factors, including expression levels (Pál et al. 2001; Drummond et al. 2005), gene essentiality (Hurst and Smith 1999; Jordan et al. 2002; Alvarez-Ponce et al. 2016), gene duplicability (Nembaware et al. 2002; Yang et al. 2003; Pegueroles et al. 2013), and number of protein–protein interactions (Fraser et al. 2002; Hahn and Kern 2005; Alvarez-Ponce and Fares 2012) (for review, see Rocha 2006; Alvarez-Ponce 2014; Zhang and Yang 2015). Clients and non-clients differ in all these parameters (table 1), and thus it is conceivable that the observed differences in the rates of evolution of clients and non-clients (fig. 1 and table 1) might be a byproduct of differences in these factors. In order to discard this possibility, we conducted a number of controls. Expression level seems to be a major determinant of proteins’ rates of evolution, with highly expressed genes tending to be more selectively constrained (Pál et al. 2001; Drummond et al. 2005, 2006). In agreement with prior results, we observed a negative correlation between expression levels and dN/dS (ρ = −0.4138, P = 1.73 × 10⁻¹⁹⁰). Chaperone clients are more highly expressed than non-clients (median expression level for clients: 23; median expression level for non-clients: 20; Mann–Whitney test, P = 1.71 × 10⁻⁵). This raises the possibility that the lower rates of evolution of clients might be a byproduct of clients being more highly expressed. However, partial correlation analysis shows that the relationship between “chaperone dependence” (a dummy variable taking the value of 1 if the protein is client of at least one chaperone, and 0 otherwise) and dN/dS is independent of expression level (partial Spearman’s rank correlation coefficient, ρ = −0.0414, P = 0.0049). Furthermore, among chaperone clients, the partial correlation between dN/dS and number of chaperones controlling for expression level is significantly negative (ρ = −0.0643, P = 0.00016).
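A partial Spearman correlation with a single control variable can be sketched with the standard first-order formula applied to ranks: ρ(xy·z) = (ρxy − ρxz·ρyz) / √((1 − ρxz²)(1 − ρyz²)). The code below is an illustration, not the “pcor.test” implementation used in the study, although the two should agree for one control variable. All function names and input values are hypothetical.

```python
import math


def ranks(values):
    """Average ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_spearman(x, y, z):
    """Spearman correlation of x and y, controlling for z."""
    rx, ry, rz = ranks(x), ranks(y), ranks(z)
    r_xy, r_xz, r_yz = pearson(rx, ry), pearson(rx, rz), pearson(ry, rz)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: x, y, and the control variable z.
rho = partial_spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [1, 3, 2, 5, 4])
print(rho)
```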
Proteins encoded by essential genes tend to be more constrained than those encoded by nonessential genes (Hurst and Smith 1999; Alvarez-Ponce et al. 2016). Among the 3,958 chaperone clients with dN/dS information, 831 (i.e., 21%) are essential. Among the 1,574 non-clients, only 228 (14.5%) are essential. Thus, clients are enriched in essential genes (Fisher’s exact test, P < 10⁻⁶), which could potentially explain their low evolutionary rates. To discard this possibility, we analyzed essential and nonessential genes separately, and in both cases clients exhibited a lower dN/dS. Among essential genes, the median dN/dS was 0.0692 for clients and 0.0913 for non-clients (Mann–Whitney test, P = 0.0016). Among nonessential genes, the median dN/dS was 0.0990 for clients and 0.1179 for non-clients (Mann–Whitney test, P = 2.06 × 10⁻¹⁶). Proteins encoded by duplicated genes tend to evolve slower than those encoded by singleton genes (Nembaware et al. 2002; Yang et al. 2003), in spite of the fact that gene duplication transiently accelerates protein evolution (Han et al. 2009; Pegueroles et al. 2013). Among clients, 1,684 (42.54%) are encoded by duplicated genes, and among non-clients, 547 (34.75%) are encoded by duplicated genes; that is, clients are enriched in proteins encoded by duplicated genes (Fisher’s exact test, P < 10⁻⁶), which might account for their slow evolution. To discard this possibility, we analyzed singleton and duplicated genes separately. Among singletons, clients exhibit lower dN/dS values (median = 0.1048) than non-clients (median = 0.1462; Mann–Whitney U test, P = 9.45 × 10⁻²⁷). Among the less numerous duplicates, clients also exhibited lower dN/dS values, but the differences were not significant (median for clients: 0.0752, median for non-clients: 0.0786, P = 0.3210). In addition, among clients, the number of chaperones significantly correlates with dN/dS, among both singletons (ρ = −0.0705, P = 0.0008) and duplicates (ρ = −0.0745, P = 0.0022).
These results indicate that the lower rates of evolution of chaperone clients are not due to their enrichment in proteins encoded by duplicated genes.

Controlling for Number of Physical Interactions Reveals That Chaperone Dependence Accelerates Protein Evolution

The number of proteins with which a protein interacts (degree centrality) negatively correlates with its rate of evolution (Fraser et al. 2002; Hahn and Kern 2005; Alvarez-Ponce and Fares 2012), a pattern that was also apparent in our data set (ρ = −0.2788, P = 2.14 × 10⁻⁹²). This, together with the fact that chaperone clients tend to exhibit more protein–protein interactions (median = 16) than non-clients (median = 8; Mann–Whitney U test, P = 3.62 × 10⁻⁵³), might account for the low rates of evolution of chaperone clients. Indeed, the partial correlation between dN/dS and chaperone dependence while controlling for degree is significantly positive (ρ = 0.0507, P = 0.0003), as is the partial correlation between the dN/dS values of clients and their number of chaperones while controlling for degree (ρ = 0.0181, P = 2.71 × 10⁻⁶). These results indicate that chaperones accelerate the rates of evolution of their clients. We repeated these analyses using degree values computed from a subset of protein–protein interactions of high quality (interactions identified either by low-throughput screens or by two or more high-throughput screens). This reduced the number of genes for which network data were available from 5,140 to 4,011. The partial correlation between dN/dS and chaperone dependence while controlling for degree remains significantly positive (ρ = 0.0405, P = 0.0104), while the correlation between dN/dS and the number of chaperones with which clients interact was not significant (ρ = 0.0055, P = 0.7546).
To further validate our results, we binned proteins into seven degree classes: 1–5 interactions (744 clients and 480 non-clients), 6–10 interactions (681 clients and 246 non-clients), 11–15 interactions (499 clients and 158 non-clients), 16–20 interactions (330 clients and 80 non-clients), 21–25 interactions (280 clients and 71 non-clients), 26–30 interactions (203 clients and 40 non-clients), and >30 interactions (1,138 clients and 190 non-clients). Within each of the classes, chaperone clients exhibited a higher median dN/dS than non-clients (fig. 2), with significant differences in the classes of degree 16–20 (one-tailed Mann–Whitney test, P = 4.30 × 10⁻⁵) and degree >30 (P = 0.0385). In addition, the observation that in all seven categories clients have a higher median dN/dS is not expected at random (binomial test, P = 0.0156).
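The reported binomial P value is what an exact two-sided test gives for seven successes in seven trials with a success probability of 0.5: 2 × 0.5⁷ = 0.015625. The sketch below reproduces that arithmetic. The function name is hypothetical, and the two-sided definition (summing all outcomes no more likely than the observed one) is an assumption about how the test was run.

```python
from math import comb


def binomial_two_sided(successes, trials, p=0.5):
    """Exact two-sided binomial test.

    Sums the probabilities of all outcomes that are no more likely
    than the observed number of successes.
    """
    def pmf(k):
        return comb(trials, k) * p ** k * (1 - p) ** (trials - k)

    observed = pmf(successes)
    # The small relative tolerance guards against floating-point ties.
    return sum(
        pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-7)
    )


# Clients have the higher median dN/dS in all seven degree classes.
p_value = binomial_two_sided(7, 7)
print(round(p_value, 4))  # 0.0156
```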
Fig. 2.—Comparison of the rate of evolution of clients and non-clients with different numbers of protein-protein interactions. Clients are represented in gray and non-clients in white. Outliers (those above the 90th and below the 10th percentiles) are not shown.

Multivariate Analyses Confirm the Accelerating Effect of Chaperones on the Evolution of Their Clients

We performed a multivariate regression analysis to study the relative influence of all the studied factors (chaperone dependence, expression level, number of protein–protein interactions, duplicability, and essentiality) simultaneously. We regressed dN/dS against the five biological factors, and found that all make a significant contribution to the regression and that the overall R2 is 0.219 (table 2). Chaperone dependence was the only factor with a positive coefficient, indicating that chaperone dependence increases protein evolutionary rates. Multivariate regression assumes that the predictor variables are statistically independent. To evaluate if our predictors intercorrelate, we used the variance inflation factor (VIF) to quantify the degree of collinearity. We found VIF values for each of the predictor variables that range from 1.03 to 1.26, which indicates that while collinearity is present in our model, it is rather low. Nevertheless, multivariate regression can produce spurious results in the presence of both collinearity and noise (Drummond et al. 2006), and our variables are affected by noisy measurements. Therefore, we also performed a principal component regression analysis, which is an established method to study the relative contributions of different determinants of protein evolutionary rates (Drummond et al. 2006), although it is not entirely insensitive to noise (Plotkin and Fraser 2007). Principal component regression finds new variables, called principal components, which are linear combinations of the original predictor variables, and then regresses the response variable against all of them. We performed principal component regression using the same predictor variables as above. Table 3 shows numerical data from the analysis, while figure 3 shows these data graphically.
Table 2

Multiple Linear Regression of Divergence Data

| Variable | dN/dS | dN | dS |
|---|---|---|---|
| Chaperone dependence | 0.17*** | 0.14*** | 0.06*** |
| Number of protein–protein interactions | −0.16*** | −0.13*** | −0.01* |
| Expression level | −0.34*** | −0.31*** | −0.09*** |
| Duplicability | −0.38*** | −0.34*** | −0.02* |
| Essentiality | −0.35*** | −0.30*** | −0.01 |

Note.—Regression coefficients are shown.

*P < 0.05 and ***P < 10⁻⁵.

Table 3

Results from the Principal Component Regression Analysis of Divergence Data

| Principal component | 1 | 2 | 3 | 4 | 5 | All |
|---|---|---|---|---|---|---|
| Percentage of explained variance in | | | | | | |
| dN/dS | 13.09*** | 2.04*** | 6.11*** | 0.22*** | 0.39*** | 21.85 |
| dN | 17.55*** | 2.82*** | 8.41*** | 0.46*** | 0.66*** | 29.89 |
| dS | 4.36*** | 0.44*** | 5.38*** | 2.16*** | 0.38*** | 12.72 |
| Percent contributions of each variable | | | | | | |
| Chaperone dependence | 0.10 | 0.05 | 0.71 | 0.08 | 0.06 | |
| Number of protein–protein interactions | 0.42 | 0.02 | 0.00 | 0.02 | 0.54 | |
| Expression level | 0.22 | 0.02 | 0.27 | 0.37 | 0.12 | |
| Duplicability | 0.00 | 0.74 | 0.02 | 0.19 | 0.05 | |
| Essentiality | 0.26 | 0.18 | 0.00 | 0.34 | 0.22 | |

Note.—We indicate in bold the contributions of a predictor to a component when >20%.

***P < 10⁻⁵.

Fig. 3.—Principal component regression on (A) dN/dS, (B) dN, and (C) dS calculated using divergence data between Saccharomyces cerevisiae and S. paradoxus for 5,532 yeast genes. For each principal component, the height of the bar represents the percent of variance in the rate of evolution explained by the component. The relative contribution of each variable to a principal component is represented with different colors. Table 3 contains the numerical data used to draw this figure.

For evolutionary rates measured as dN/dS, we found a principal component with a ∼70% contribution of chaperone dependence and ∼30% of expression level. This component explained a modest 6% of the variance with high significance (table 3 and fig. 3). Another significant principal component explains 13% of the variance. This component is mainly determined by the number of protein–protein interactions, essentiality, and expression level. A component explaining just ∼2% of the variance was mainly determined by duplicability. The other two significant components explained in combination <1% of the variance. In summary, we found that chaperone dependence was the biological factor explaining the largest fraction of the total variance in the rate of evolution measured as dN/dS (5.77%) (table 4). It explained a larger fraction of the total variance than expression level (4.72%), and similar to the fraction explained by the number of protein–protein interactions (5.75%). Similar results were observed for dN (tables 3 and 4; fig. 3).
For dS, chaperone dependence was still the main factor explaining the total variance in the rate of evolution, with a contribution of 4.48%—still above that of expression level (3.27%) (table 4). Indeed, it was the main determinant (∼70%) of the principal component explaining the largest fraction of the variance (5.38%) (table 3).
Table 4

Total Variance Explained by Each Variable in the Principal Component Regression Analysis of Divergence Data

| Variable | dN/dS | dN | dS |
|---|---|---|---|
| Chaperone dependence | 5.77% | 7.91% | 4.48% |
| Number of protein–protein interactions | 5.75% | 7.78% | 2.08% |
| Expression level | 4.72% | 6.47% | 3.27% |
| Duplicability | 1.68% | 2.36% | 0.86% |
| Essentiality | 3.93% | 5.37% | 2.03% |
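The totals in table 4 appear to be consistent with weighting the variance explained by each principal component (table 3) by each variable's contribution to that component and summing across components. The sketch below checks this for dN/dS using the rounded values printed in table 3, so small discrepancies in the second decimal are expected. That the authors computed table 4 exactly this way is an assumption.

```python
# Percentage of variance in dN/dS explained by components 1 to 5 (table 3).
variance_explained = [13.09, 2.04, 6.11, 0.22, 0.39]

# Contribution of each variable to components 1 to 5 (table 3).
contributions = {
    "Chaperone dependence": [0.10, 0.05, 0.71, 0.08, 0.06],
    "Number of protein-protein interactions": [0.42, 0.02, 0.00, 0.02, 0.54],
    "Expression level": [0.22, 0.02, 0.27, 0.37, 0.12],
    "Duplicability": [0.00, 0.74, 0.02, 0.19, 0.05],
    "Essentiality": [0.26, 0.18, 0.00, 0.34, 0.22],
}

# Total variance attributed to each variable: sum over components of
# (contribution to the component) x (variance explained by the component).
totals = {
    name: sum(c * v for c, v in zip(row, variance_explained))
    for name, row in contributions.items()
}

for name, total in totals.items():
    print(f"{name}: {total:.2f}%")
```

The per-variable totals add up to roughly the 21.85% explained by all components together.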
Finally, we performed an analysis of covariance (ANCOVA), which is a category-based analysis in which we evaluated the effect of chaperone dependence on the rate of protein evolution measured as dN/dS while controlling for the effect of the most important predictors: number of protein–protein interactions, expression level, and essentiality. We used the principal component of these three variables (principal component 1 in table 3 and fig. 3) as the continuous variable in the ANCOVA. We found that chaperone clients evolve on average 23% faster than all proteins (P = 8.6 × 10⁻⁷) (fig. 4).
Fig. 4.—ANCOVA. Chaperone clients (gray points, continuous line) evolve 23% above the genome average rate (light points, dashed line) when considering divergence data between Saccharomyces cerevisiae and S. paradoxus.

Separate Analysis of the Clients of Individual Chaperones

Thus far, we have aggregated the clients of all chaperones into a single group. However, different chaperones may affect the rates of protein evolution in different ways. We thus considered the clients of each chaperone separately. For each chaperone, we compared the clients of the chaperone against the proteins that are not clients of any chaperone. We again found that in all 35 cases clients exhibit a lower median and average dN/dS, with significant differences in 32 cases (Mann–Whitney U test, P < 0.05; table 5). However, partial correlations between dN/dS and chaperone dependence using degree as controlling variable were positive in 23 cases (significantly positive in 13 cases) and negative in 12 cases (significantly negative in 0 cases). This approach has the limitations that some chaperones have very few known clients, and that clients of the chaperone of interest may also be clients of other chaperones.
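Table 5 reports Q values alongside P values without naming the correction in this section. The printed values look consistent with the Benjamini–Hochberg procedure over 35 tests: for the partial correlations, the largest P value (0.8781) is unchanged, and 0.8284 × 35/34 ≈ 0.8528 and 0.7214 × 35/33 ≈ 0.7651 match the listed Q values. The sketch below implements that procedure under this assumption. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (Q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(q)
```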
Table 5

Comparison between the Rates of Evolution of Clients of Different Yeast Chaperones and Proteins That Are Not Clients of Any Chaperone

| Class | Chaperone | Clients n | Clients median | Clients mean | Non-clients n | Non-clients median | Non-clients mean | Mann–Whitney P value | Mann–Whitney Q value | Partial correlation ρ | Partial correlation P value | Partial correlation Q value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CCTs | Cct2 | 120 | 0.0807 | 0.0953 | 1,574 | 0.1149 | 0.2563 | 1.64 × 10−6*** | 2.61 × 10−6*** | 0.0289 | 0.2822 | 0.4490 |
| CCTs | Cct3 | 119 | 0.0800 | 0.0993 | 1,574 | 0.1149 | 0.2563 | 3.41 × 10−5** | 4.26 × 10−5** | 0.0311 | 0.2481 | 0.4135 |
| CCTs | Cct4 | 158 | 0.0861 | 0.1051 | 1,574 | 0.1149 | 0.2563 | 5.52 × 10−5** | 6.66 × 10−5** | 0.0560 | 0.0345* | 0.1098 |
| CCTs | Cct5 | 34 | 0.0603 | 0.0769 | 1,574 | 0.1149 | 0.2563 | 0.0002** | 0.0002** | −0.0225 | 0.4177 | 0.5629 |
| CCTs | Cct6 | 92 | 0.0689 | 0.0822 | 1,574 | 0.1149 | 0.2563 | 5.00 × 10−8*** | 1.17 × 10−7*** | −0.0190 | 0.4850 | 0.5853 |
| CCTs | Cct7 | 40 | 0.0461 | 0.0611 | 1,574 | 0.1149 | 0.2563 | 1.49 × 10−7*** | 2.90 × 10−7*** | −0.0542 | 0.0502 | 0.1255 |
| CCTs | Cct8 | 179 | 0.0848 | 0.1016 | 1,574 | 0.1149 | 0.2563 | 6.07 × 10−7*** | 1.12 × 10−6*** | 0.0390 | 0.1389 | 0.2749 |
| CCTs | Tcp1 | 46 | 0.0550 | 0.0776 | 1,574 | 0.1149 | 0.2563 | 7.91 × 10−6*** | 1.15 × 10−5** | −0.0241 | 0.3832 | 0.5588 |
| Hsp70s | Ecm10 | 64 | 0.0633 | 0.0799 | 1,574 | 0.1149 | 0.2563 | 2.08 × 10−6*** | 3.17 × 10−6*** | −0.0215 | 0.4342 | 0.5629 |
| Hsp70s | Kar2 | 68 | 0.0694 | 0.1000 | 1,574 | 0.1149 | 0.2563 | 0.0002** | 0.0002** | 0.0274 | 0.3181 | 0.4841 |
| Hsp70s | Lhs1 | 78 | 0.0782 | 0.0837 | 1,574 | 0.1149 | 0.2563 | 1.28 × 10−6*** | 2.13 × 10−6*** | 0.0042 | 0.8781 | 0.8781 |
| Hsp70s | Ssa1 | 2,385 | 0.0903 | 0.1098 | 1,574 | 0.1149 | 0.2563 | 6.43 × 10−22*** | 1.13 × 10−20*** | 0.0606 | 0.0003** | 0.0013* |
| Hsp70s | Ssa2 | 1,828 | 0.0907 | 0.1104 | 1,574 | 0.1149 | 0.2563 | 4.69 × 10−19*** | 4.10 × 10−18*** | 0.0742 | 3.86 × 10−5** | 0.0004** |
| Hsp70s | Ssa3 | 304 | 0.0949 | 0.1129 | 1,574 | 0.1149 | 0.2563 | 2.70 × 10−5*** | 3.50 × 10−5** | 0.0955 | 0.0002** | 0.0010* |
| Hsp70s | Ssa4 | 436 | 0.0918 | 0.1088 | 1,574 | 0.1149 | 0.2563 | 6.75 × 10−8*** | 1.39 × 10−7*** | 0.0965 | 0.0001** | 0.0006** |
| Hsp70s | Ssb1 | 3,109 | 0.0916 | 0.1148 | 1,574 | 0.1149 | 0.2563 | 1.51 × 10−21*** | 1.76 × 10−20*** | 0.0604 | 7.02 × 10−5** | 0.0005** |
| Hsp70s | Ssb2 | 1,167 | 0.0881 | 0.1116 | 1,574 | 0.1149 | 0.2563 | 1.18 × 10−17*** | 6.88 × 10−17*** | 0.0833 | 4.01 × 10−5*** | 0.0004** |
| Hsp70s | Ssc1 | 191 | 0.0603 | 0.0827 | 1,574 | 0.1149 | 0.2563 | 5.22 × 10−15** | 2.61 × 10−14*** | −0.0165 | 0.5285 | 0.5967 |
| Hsp70s | Sse1 | 1,862 | 0.0913 | 0.1098 | 1,574 | 0.1149 | 0.2563 | 5.32 × 10−18*** | 3.72 × 10−17*** | 0.0742 | 3.45 × 10−5** | 0.0004** |
| Hsp70s | Sse2 | 234 | 0.0812 | 0.0990 | 1,574 | 0.1149 | 0.2563 | 3.13 × 10−8*** | 7.83 × 10−8*** | 0.0409 | 0.1138 | 0.2489 |
| Hsp70s | Ssq1 | 91 | 0.0621 | 0.0766 | 1,574 | 0.1149 | 0.2563 | 1.82 × 10−9*** | 5.79 × 10−9*** | −0.0318 | 0.2413 | 0.4135 |
| Hsp70s | Ssz1 | 636 | 0.0948 | 0.1099 | 1,574 | 0.1149 | 0.2563 | 5.16 × 10−9*** | 1.51 × 10−8*** | 0.1001 | 1.19 × 10−5** | 0.0004** |
| Hsp90s | Hsc82 | 421 | 0.0810 | 0.0986 | 1,574 | 0.1149 | 0.2563 | 1.97 × 10−13*** | 8.62 × 10−13*** | 0.0482 | 0.0480* | 0.1255 |
| Hsp90s | Hsp82 | 832 | 0.0886 | 0.1157 | 1,574 | 0.1149 | 0.2563 | 2.58 × 10−13*** | 1.00 × 10−12*** | 0.0696 | 0.0015* | 0.0058* |
| Hsp100s | Hsp78 | 773 | 0.0779 | 0.0958 | 1,574 | 0.1149 | 0.2563 | 4.45 × 10−24*** | 1.56 × 10−22*** | 0.0398 | 0.0730 | 0.1703 |
| Hsp100s | Hsp104 | 358 | 0.0877 | 0.1079 | 1,574 | 0.1149 | 0.2563 | 5.36 × 10−8*** | 1.17 × 10−7*** | 0.0719 | 0.0037* | 0.0130* |
| Small | Hsp31 | 98 | 0.0848 | 0.0987 | 1,574 | 0.1149 | 0.2563 | 0.0004** | 0.0004** | 0.0319 | 0.2385 | 0.4135 |
| Small | Hsp32 | 2 | 0.0635 | 0.0639 | 1,574 | 0.1149 | 0.2563 | 0.3045 | 0.3045 | −0.0100 | 0.7214 | 0.7651 |
| Small | Hsp33 | 3 | 0.0363 | 0.0454 | 1,574 | 0.1149 | 0.2563 | 0.0739 | 0.0783 | −0.0221 | 0.4315 | 0.5629 |
| Small | Sno4 | 3 | 0.0261 | 0.0622 | 1,574 | 0.1149 | 0.2563 | 0.1723 | 0.1773 | −0.0061 | 0.8284 | 0.8528 |
| Other | Hsp12 | 91 | 0.0731 | 0.0930 | 1,574 | 0.1149 | 0.2563 | 2.21 × 10−5** | 2.98 × 10−5** | 0.0180 | 0.5076 | 0.5922 |
| Other | Hsp26 | 85 | 0.0651 | 0.0806 | 1,574 | 0.1149 | 0.2563 | 1.60 × 10−8*** | 4.31 × 10−8*** | −0.0205 | 0.4523 | 0.5654 |
| Other | Hsp42 | 361 | 0.0839 | 0.1044 | 1,574 | 0.1149 | 0.2563 | 8.57 × 10−10*** | 3.00 × 10−9*** | 0.0493 | 0.0468* | 0.1255 |
| Other | Hsp60 | 95 | 0.0777 | 0.0917 | 1,574 | 0.1149 | 0.2563 | 1.59 × 10−5** | 2.23 × 10−5** | 0.0159 | 0.5573 | 0.6095 |
| Other | Mcx1 | 41 | 0.0511 | 0.0718 | 1,574 | 0.1149 | 0.2563 | 1.27 × 10−6*** | 2.13 × 10−6*** | −0.0407 | 0.1414 | 0.2749 |

Note.—For each pair of clients versus non-clients, the highest dN/dS values are shown in bold face. Partial correlations correspond to the Spearman correlation between chaperone dependence and dN/dS, controlling for number of protein–protein interactions. Q values were computed using the Benjamini–Hochberg approach (Benjamini and Hochberg 1995).

*P < 0.05, **P < 0.001, and ***P < 10−5.


Analysis of the Clients of Different Groups of Chaperones

We next grouped chaperones into five groups: small Hsps (Hsp31, Hsp32, Hsp33, and Sno4), Hsp70s (Kar2, Ssb1, Sse1, Sse2, Ssa1, Ssa2, Ssa3, Ssa4, Ssb2, Ecm10, Ssc1, Ssq1, Ssz1, and Lhs1), Hsp90s (Hsp82 and Hsc82), Hsp100s (Hsp78 and Hsp104), and CCTs (Tcp1, Cct4, Cct8, Cct2, Cct3, Cct5, Cct6, and Cct7), and investigated the rates of evolution of the clients of each group. Single-family chaperones (Hsp26, Hsp42, Hsp12, Mcx1, and Hsp60) were not included in this analysis. For each group of chaperones, we compared the rates of evolution of proteins that are clients of any of the chaperones of the group, against proteins that are not clients of any chaperone. In all five cases, clients had a significantly lower dN/dS. However, partial correlations between the dependence of each group and dN/dS controlling for degree were always positive, and significant for the three chaperone classes with more clients (Hsp70s, Hsp90s, and Hsp100s) (table 6). This approach has the limitation that clients of one group of chaperones may also be clients of chaperones outside that group.
Table 6

Comparison between the Rates of Evolution of Clients of Different Chaperone Families and Proteins That Are Not Clients of Any Chaperone

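| Class | Clients n | Clients median | Clients mean | Non-clients n | Non-clients median | Non-clients mean | Mann–Whitney P value | Mann–Whitney Q value | Partial correlation ρ | Partial correlation P value | Partial correlation Q value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CCTs | 614 | 0.0794 | 0.0967 | 1,574 | 0.1149 | 0.2563 | 2.55 × 10−20*** | 4.25 × 10−20*** | 0.0349 | 0.1310 | 0.1638 |
| Hsp70s | 3,783 | 0.0932 | 0.1156 | 1,574 | 0.1149 | 0.2563 | 2.66 × 10−21*** | 6.65 × 10−21*** | 0.0550 | 0.0001** | 0.0005** |
| Hsp90s | 1,101 | 0.0861 | 0.1115 | 1,574 | 0.1149 | 0.2563 | 1.27 × 10−17*** | 1.59 × 10−17*** | 0.0615 | 0.0028* | 0.0070* |
| Hsp100s | 1,004 | 0.0824 | 0.1005 | 1,574 | 0.1149 | 0.2563 | 2.91 × 10−23*** | 1.46 × 10−22*** | 0.0537 | 0.0106* | 0.0176* |
| Small | 104 | 0.0809 | 0.0966 | 1,574 | 0.1149 | 0.2563 | 0.0001** | 0.0001** | 0.0285 | 0.2926 | 0.2926 |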

Note.—For each pair of clients versus non-clients, the highest dN/dS values are shown in bold face. Partial correlations correspond to the Spearman correlation between chaperone dependence and dN/dS, controlling for number of protein–protein interactions. Q values were computed using the Benjamini–Hochberg approach (Benjamini and Hochberg 1995).
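As a rough illustration of how Q values of this kind are obtained, the sketch below implements the Benjamini–Hochberg step-up adjustment in plain Python and applies it to the five partial-correlation P values reported in table 6. The function name `benjamini_hochberg` is introduced here for illustration. Because the published P values are rounded, the output is expected to agree with the table only approximately.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (Q values), in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic Q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Partial-correlation P values from table 6, in row order:
# CCTs, Hsp70s, Hsp90s, Hsp100s, small Hsps.
p_partial = [0.1310, 0.0001, 0.0028, 0.0106, 0.2926]
q_partial = benjamini_hochberg(p_partial)
for name, q in zip(["CCTs", "Hsp70s", "Hsp90s", "Hsp100s", "Small"], q_partial):
    print(name, round(q, 4))
```

The same adjustment applied to the Mann–Whitney P values in table 6 also matches the reported Q values up to rounding, which suggests that each column of P values was corrected as its own family of tests.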

*P < 0.05, **P < 0.001, and ***P < 10−5.

Next, in order to tease apart the effects of the different chaperone groups on rates of protein evolution while controlling for possible confounding factors, we performed two different multivariable analyses. We first performed a multiple linear regression analysis regressing dN/dS against the four confounding biological factors we consider here (number of protein–protein interactions, expression level, essentiality, and duplicability), and dependence on each of the five chaperone families (Hsp70s, Hsp90s, Hsp100s, CCTs, and small Hsps). We found that among the chaperone families only Hsp70s and Hsp90s make a significant contribution to the regression and that the overall R2 is 0.220 (table 7). Hsp70s and Hsp90s dependence were the only factors with a significantly positive coefficient, indicating that dependence on these two major chaperone groups increases protein evolutionary rates. The contribution of Hsp90s was lost when regressing dN or dS instead of dN/dS (table 7).
Table 7

Multiple Linear Regression of Different Chaperone Families

| Predictor | dN/dS | dN | dS |
| --- | --- | --- | --- |
| HSP70 dependence | 0.15*** | 0.12*** | 0.04** |
| HSP90 dependence | 0.10* | 0.06 | 0.02 |
| HSP100 dependence | 0.04 | 0.05 | 0.05*** |
| CCT dependence | −0.03 | −0.04 | −0.01 |
| SMALL dependence | −0.10 | 0.02 | 0.03 |
| Number of protein–protein interactions | −0.16*** | −0.13*** | −0.01* |
| Expression level | −0.34*** | −0.31*** | −0.09*** |
| Duplicability | −0.38*** | −0.34*** | −0.02* |
| Essentiality | −0.35*** | −0.30*** | −0.01 |

*P < 0.05, **P < 0.001, and ***P < 10−5.

We then performed a principal component regression analysis using the same predictor variables as above. Table 8 shows numerical data from the analysis, while figure 5 shows these data graphically. Neither Hsp70s dependence nor Hsp90s dependence contributed individually >20% to any significant principal component, but in combination they determine 30% of a component explaining 4.48% of the variance in dN/dS (table 8). In combination, Hsp70s and Hsp90s dependence contribute 3.19% to the total variance in the rate of evolution, which is above the contribution of the number of protein–protein interactions, but below the contributions of expression level, essentiality, or duplicability (table 9).
Table 8

Results from the Principal Component Regression of Different Chaperone Families

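Percentage of explained variance

| Principal component | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | All |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| dN/dS | 4.48*** | 9.58*** | 7.28*** | 0.01 | 0.04 | 0.26*** | 0.02 | 0.01 | 0.32*** | 21.99 |
| dN | 6.08*** | 12.87*** | 9.80*** | 0.00 | 0.09* | 0.54*** | 0.04 | 0.04 | 0.55*** | 29.99 |
| dS | 0.56*** | 6.25*** | 3.28*** | 0.00 | 0.60*** | 1.90*** | 0.01 | 0.12* | 0.31*** | 13.03 |

Percent contributions of each variable

| HSP70 dependence | 0.14 | 0.07 | 0.02 | 0.03 | 0.37 | 0.12 |
| HSP90 dependence | 0.16 | 0.10 | 0.01 | 0.01 | 0.00 | 0.03 |
| HSP100 dependence | 0.20 | 0.05 | 0.01 | 0.00 | 0.00 | 0.01 |
| CCT dependence | 0.15 | 0.04 | 0.00 | 0.00 | 0.24 | 0.22 |
| SMALL dependence | 0.02 | 0.04 | 0.02 | 0.90 | 0.00 | 0.01 |
| Number of protein–protein interactions | 0.21 | 0.13 | 0.03 | 0.00 | 0.00 | 0.05 |
| Expression level | 0.05 | 0.28 | 0.00 | 0.00 | 0.14 | 0.25 |
| Duplicability | 0.00 | 0.05 | 0.04 | 0.04 | 0.07 | 0.13 |
| Essentiality | 0.26 | 0.25 | 0.01 | 0.01 | 0.17 | 0.19 |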

Note.—We indicate in bold the contributions of a predictor to a component when >20%.

*P < 0.05 and ***P < 10−5.

Fig. 5.—Principal component regression on (A) dN/dS, (B) dN, and (C) dS calculated using divergence data between Saccharomyces cerevisiae and S. paradoxus for 5,532 yeast genes. For each principal component, the height of the bar represents the percent of variance in the rate of evolution explained by the component. The relative contribution of each variable to a principal component is represented with different colors. Table 8 contains the numerical data used to draw this figure.

Table 9

Total Variance Explained by Each Variable in the Principal Component Regression Analysis of Different Chaperone Families

| Predictor | dN/dS | dN | dS |
| --- | --- | --- | --- |
| HSP70 dependence | 1.48% | 2.04% | 1.05% |
| HSP90 dependence | 1.71% | 2.32% | 0.80% |
| HSP100 dependence | 1.43% | 1.95% | 0.54% |
| CCT dependence | 1.10% | 1.54% | 0.88% |
| SMALL dependence | 0.64% | 0.84% | 0.35% |
| Number of protein–protein interactions | 2.66% | 3.66% | 1.35% |
| Expression level | 4.07% | 5.55% | 2.86% |
| Duplicability | 5.36% | 7.25% | 2.79% |
| Essentiality | 3.55% | 4.84% | 2.43% |

Chaperones Increase the Ratio of Nonsynonymous to Synonymous Polymorphism

For each S. cerevisiae gene, we obtained the nonsynonymous to synonymous polymorphism ratio (dN/dS) from Peter et al. (2018). Chaperone clients exhibit a significantly lower dN/dS ratio (median for clients: 0.2352, median for non-clients: 0.2642, Mann–Whitney U test, P = 2.96 × 10−10). Partial correlation between dN/dS and chaperone dependence controlling for expression level was nonsignificant (ρ = −0.0015, P = 0.9159), and the partial correlation between dN/dS and chaperone dependence controlling for network degree was significantly positive (ρ  =  0.0612, P = 10−5). For each chaperone, we compared the rates of evolution of their clients (n ranged from 2 to 3,102) against the rates of evolution of non-clients (proteins that are not clients of any chaperone, n = 2,152). In all 35 cases, clients exhibited a lower average dN/dS, and in 34 of the cases they also exhibited a lower median dN/dS, with significant differences in 26 cases (Mann–Whitney U test, P < 0.05; table 10). Partial correlations between dN/dS and chaperone dependence controlling for degree were positive in 28 cases (significantly positive in 19 cases) and negative in seven cases (significantly negative in 0 cases).
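For readers unfamiliar with the test used in these comparisons, the following is a minimal sketch of a two-sided Mann–Whitney U test based on the normal approximation, without tie or continuity corrections. It is a simplification: a statistics package would normally be used, and the article does not state which implementation the authors relied on. The function name `mann_whitney_u` and the six example values are hypothetical.

```python
import math

def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, two-sided P value from the normal approximation)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted(
        (value, group)
        for group, sample in enumerate((sample_a, sample_b))
        for value in sample
    )
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average rank shared by tied values
        rank_sum_a += mean_rank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 0)
        i = j + 1
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value

# Hypothetical ratios for three clients and three non-clients.
clients = [0.05, 0.08, 0.09]
non_clients = [0.11, 0.20, 0.35]
u_stat, p_val = mann_whitney_u(clients, non_clients)
print(u_stat, p_val)
```

A U of zero for the first sample means that every value in it is smaller than every value in the second sample. With samples this small an exact test would be preferable to the normal approximation.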
Table 10

Comparison between the Nonsynonymous to Synonymous Polymorphism Ratio of Clients of Different Yeast Chaperones and Proteins That Are Not Clients of Any Chaperone

| Class | Chaperone | Clients n | Clients median | Clients mean | Non-clients n | Non-clients median | Non-clients mean | Mann–Whitney P value | Mann–Whitney Q value | Partial correlation ρ | Partial correlation P value | Partial correlation Q value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CCTs | Cct2 | 119 | 0.2198 | 0.2508 | 2,152 | 0.2642 | 0.3537 | 0.0053** | 0.0103* | 0.0504 | 0.0538 | 0.0942 |
| CCTs | Cct3 | 120 | 0.2428 | 0.2488 | 2,152 | 0.2642 | 0.3537 | 0.0157* | 0.0239* | 0.0534 | 0.0406* | 0.0748 |
| CCTs | Cct4 | 162 | 0.2567 | 0.2925 | 2,152 | 0.2642 | 0.3537 | 0.5048 | 0.5354 | 0.1136 | 9.06 × 10−6*** | 3.52 × 10−5** |
| CCTs | Cct5 | 35 | 0.2447 | 0.2589 | 2,152 | 0.2642 | 0.3537 | 0.1807 | 0.2181 | 0.0296 | 0.2712 | 0.4157 |
| CCTs | Cct6 | 92 | 0.1751 | 0.2021 | 2,152 | 0.2642 | 0.3537 | 3.65 × 10−6*** | 1.54 × 10−5** | −0.0203 | 0.4413 | 0.6031 |
| CCTs | Cct7 | 41 | 0.2006 | 0.2065 | 2,152 | 0.2642 | 0.3537 | 0.0022** | 0.0048** | −0.0031 | 0.9072 | 0.9622 |
| CCTs | Cct8 | 177 | 0.2423 | 0.2604 | 2,152 | 0.2642 | 0.3537 | 0.0059** | 0.0109* | 0.0603 | 0.0185* | 0.0381* |
| CCTs | Tcp1 | 46 | 0.2231 | 0.2340 | 2,152 | 0.2642 | 0.3537 | 0.0118* | 0.0200* | 0.0140 | 0.6008 | 0.7510 |
| Hsp70s | Ecm10 | 64 | 0.1925 | 0.1891 | 2,152 | 0.2642 | 0.3537 | 1.92 × 10−5** | 6.72 × 10−5** | −0.0245 | 0.3574 | 0.5212 |
| Hsp70s | Kar2 | 67 | 0.2074 | 0.2436 | 2,152 | 0.2642 | 0.3537 | 0.0120* | 0.0200* | 0.0345 | 0.1938 | 0.3230 |
| Hsp70s | Lhs1 | 76 | 0.1968 | 0.2165 | 2,152 | 0.2642 | 0.3537 | 0.0004** | 0.0010** | 0.0099 | 0.7096 | 0.8564 |
| Hsp70s | Ssa1 | 2,380 | 0.2375 | 0.2714 | 2,152 | 0.2642 | 0.3537 | 1.79 × 10−8*** | 2.09 × 10−7*** | 0.0815 | 7.13 × 10−7*** | 3.12 × 10−6*** |
| Hsp70s | Ssa2 | 1,818 | 0.2397 | 0.2735 | 2,152 | 0.2642 | 0.3537 | 2.05 × 10−6*** | 1.03 × 10−5** | 0.0978 | 4.05 × 10−8*** | 2.95 × 10−7*** |
| Hsp70s | Ssa3 | 300 | 0.2429 | 0.2782 | 2,152 | 0.2642 | 0.3537 | 0.0361* | 0.0505 | 0.1010 | 3.91 × 10−5** | 0.0001** |
| Hsp70s | Ssa4 | 433 | 0.2453 | 0.2771 | 2,152 | 0.2642 | 0.3537 | 0.0208* | 0.0303* | 0.1193 | 4.17 × 10−7*** | 2.09 × 10−6*** |
| Hsp70s | Ssb1 | 3,102 | 0.2355 | 0.2730 | 2,152 | 0.2642 | 0.3537 | 5.26 × 10−9*** | 1.84 × 10−7*** | 0.0780 | 2.38 × 10−7*** | 1.39 × 10−6*** |
| Hsp70s | Ssb2 | 1,160 | 0.2444 | 0.2734 | 2,152 | 0.2642 | 0.3537 | 9.19 × 10−5** | 2.47 × 10−4** | 0.1200 | 1.73 × 10−9*** | 2.23 × 10−8*** |
| Hsp70s | Ssc1 | 190 | 0.1807 | 0.2279 | 2,152 | 0.2642 | 0.3537 | 1.62 × 10−6*** | 1.13 × 10−6*** | 0.0162 | 0.5245 | 0.6799 |
| Hsp70s | Sse1 | 1,845 | 0.2330 | 0.2688 | 2,152 | 0.2642 | 0.3537 | 2.44 × 10−8*** | 2.14 × 10−7*** | 0.0732 | 3.86 × 10−5** | 0.0001** |
| Hsp70s | Sse2 | 229 | 0.2547 | 0.2786 | 2,152 | 0.2642 | 0.3537 | 0.1279 | 0.1599 | 0.1057 | 2.51 × 10−5** | 8.79 × 10−5** |
| Hsp70s | Ssq1 | 93 | 0.1681 | 0.1936 | 2,152 | 0.2642 | 0.3537 | 2.38 × 10−7*** | 1.39 × 10−6*** | −0.0289 | 0.2732 | 0.4157 |
| Hsp70s | Ssz1 | 628 | 0.2533 | 0.2798 | 2,152 | 0.2642 | 0.3537 | 0.0393* | 0.0529 | 0.1362 | 1.05 × 10−9*** | 2.23 × 10−8*** |
| Hsp90s | Hsc82 | 419 | 0.2185 | 0.2547 | 2,152 | 0.2642 | 0.3537 | 3.95 × 10−6*** | 1.54 × 10−5** | 0.0672 | 0.0047** | 0.0103* |
| Hsp90s | Hsp82 | 828 | 0.2525 | 0.2831 | 2,152 | 0.2642 | 0.3537 | 0.0150* | 0.0239* | 0.1282 | 1.91 × 10−9*** | 2.23 × 10−8*** |
| Hsp100s | Hsp78 | 759 | 0.2255 | 0.2556 | 2,152 | 0.2642 | 0.3537 | 1.52 × 10−8*** | 2.09 × 10−7*** | 0.0805 | 0.0002** | 5.00 × 10−4** |
| Hsp100s | Hsp104 | 358 | 0.2381 | 0.2647 | 2,152 | 0.2642 | 0.3537 | 0.0021* | 0.0048* | 0.0908 | 0.0002** | 5.00 × 10−4** |
| Small | Hsp31 | 99 | 0.2311 | 0.2862 | 2,152 | 0.2642 | 0.3537 | 0.3822 | 0.4180 | 0.0759 | 0.0038* | 0.0089* |
| Small | Hsp32 | 2 | 0.2390 | 0.2390 | 2,152 | 0.2642 | 0.3537 | 0.7342 | 0.7558 | 0.0084 | 0.7591 | 0.8856 |
| Small | Hsp33 | 3 | 0.1244 | 0.1769 | 2,152 | 0.2642 | 0.3537 | 0.3003 | 0.3390 | −0.0037 | 0.8916 | 0.9622 |
| Small | Sno4 | 3 | 0.3315 | 0.2547 | 2,152 | 0.2642 | 0.3537 | 0.7703 | 0.7703 | 0.0207 | 0.4480 | 0.6031 |
| Other | Hsp12 | 88 | 0.2476 | 0.2645 | 2,152 | 0.2642 | 0.3537 | 0.0886 | 0.1149 | 0.0599 | 0.0232* | 0.0451* |
| Other | Hsp26 | 84 | 0.1676 | 0.2215 | 2,152 | 0.2642 | 0.3537 | 6.75 × 10−5** | 1.97 × 10−4** | −0.0013 | 0.9594 | 0.9865 |
| Other | Hsp42 | 364 | 0.2520 | 0.2854 | 2,152 | 0.2642 | 0.3537 | 0.2499 | 0.2916 | 0.1316 | 4.22 × 10−8*** | 2.95 × 10−7*** |
| Other | Hsp60 | 95 | 0.1943 | 0.2150 | 2,152 | 0.2642 | 0.3537 | 6.04 × 10−5** | 1.92 × 10−4** | −0.0032 | 0.9043 | 0.9622 |
| Other | Mcx1 | 42 | 0.1655 | 0.2165 | 2,152 | 0.2642 | 0.3537 | 0.0034* | 0.0070* | 0.0005 | 0.9865 | 0.9865 |

Note.—For each pair of clients versus non-clients, the highest dN/dS values are shown in bold face. Partial correlations correspond to the Spearman correlation between chaperone dependence and dN/dS, controlling for number of protein–protein interactions. Q values were computed using the Benjamini–Hochberg approach (Benjamini and Hochberg 1995).

*P < 0.05, **P < 0.001, and ***P < 10−5.

Next, for each group of chaperones (small Hsps, Hsp70s, Hsp90s, Hsp100s, and CCTs), we compared the rates of nonsynonymous to synonymous polymorphism of the clients of any chaperone in the group (n ranged from 103 to 947) against those of proteins that are not clients of any chaperone (n = 2,152). In all five cases, clients exhibited lower median and mean dN/dS, with significant differences (Mann–Whitney U test, P < 0.05) in all cases except for the clients of small Hsps (the smallest group; table 11). However, partial correlations between dN/dS and chaperone dependence controlling for network degree were always significantly positive.
Table 11

Comparison between the Nonsynonymous to Synonymous Polymorphism Ratio of Clients of Different Chaperone Families and Proteins That Are Not Clients of Any Chaperone

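| Class | Clients n | Clients median | Clients mean | Non-clients n | Non-clients median | Non-clients mean | Mann–Whitney P value | Mann–Whitney Q value | Partial correlation ρ | Partial correlation P value | Partial correlation Q value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CCTs | 487 | 0.2436 | 0.2694 | 2,152 | 0.2642 | 0.3537 | 0.0004** | 0.0001** | 0.0942 | 5.36 × 10−5** | 0.0020* |
| Hsp70s | 836 | 0.2416 | 0.2907 | 2,152 | 0.2642 | 0.3537 | 0.0009** | 0.0201* | 0.0504 | 0.0201* | 0.0352* |
| Hsp90s | 947 | 0.2433 | 0.2731 | 2,152 | 0.2642 | 0.3537 | 0.0001** | 4.60 × 10−6*** | 0.1024 | 9.20 × 10−7*** | 0.0003** |
| Hsp100s | 863 | 0.2272 | 0.2592 | 2,152 | 0.2642 | 0.3537 | 1.82 × 10−8*** | 0.0007** | 0.0752 | 0.0004** | 0.0120* |
| Small | 103 | 0.2479 | 0.2875 | 2,152 | 0.2642 | 0.3537 | 0.4503 | 0.0020* | 0.0825 | 0.0016* | 0.0066* |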

Note.—For each pair of clients versus non-clients, the highest dN/dS values are shown in bold face. Partial correlations correspond to the Spearman correlation between chaperone dependence and dN/dS, controlling for number of protein–protein interactions. Q values were computed using the Benjamini–Hochberg approach (Benjamini and Hochberg 1995).

*P < 0.05, **P < 0.001, and ***P < 10−5.

Finally, we performed a multivariable analysis to study the effect of chaperone dependence on dN/dS at the intrapopulation level controlling simultaneously for all the studied variables, as we did previously for the divergence data. The results are very similar. We first regressed dN/dS against the five biological factors, and found that all make a significant contribution to the regression and that the overall R2 is 0.17 (table 12). Chaperone dependence was the only factor with a positive coefficient, indicating that chaperone dependence also increases dN/dS within yeast populations. We also performed a principal component regression analysis using the same predictor variables as above. Table 13 shows numerical data from the analysis, while figure 6 shows these data graphically.
Table 12

Multiple Linear Regression of Polymorphism Data

| Predictor | dN/dS |
| --- | --- |
| Chaperone dependence | 0.16*** |
| Number of protein–protein interactions | −0.06*** |
| Expression level | −0.23*** |
| Duplicability | −0.06* |
| Essentiality | −0.20*** |

*P < 0.05 and ***P < 10−5.

Table 13

Results from the Principal Component Regression Analysis of Polymorphism Data

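| Principal component | 1 | 2 | 3 | 4 | 5 | All |
| --- | --- | --- | --- | --- | --- | --- |
| Percentage of explained variance in dN/dS | 8.63 | 0.30 | 6.88 | 0.91 | 0.41 | 17.13 |
| Percent contributions of each variable | | | | | | |
| Chaperone dependence | 0.13 | 0.06 | 0.68 | 0.05 | 0.08 | |
| Number of protein–protein interactions | 0.41 | 0.02 | 0.00 | 0.02 | 0.56 | |
| Expression level | 0.22 | 0.04 | 0.29 | 0.34 | 0.12 | |
| Duplicability | 0.00 | 0.71 | 0.02 | 0.24 | 0.03 | |
| Essentiality | 0.25 | 0.18 | 0.01 | 0.36 | 0.22 | |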

Note.—We indicate in bold the contributions of a predictor to a component when >20%.

Fig. 6.—Principal component regression on dN/dS calculated using genetic variants segregating in Saccharomyces cerevisiae for 6,132 yeast genes. For each principal component, the height of the bar represents the percent of variance in the rate of evolution explained by the component. The relative contribution of each variable to a principal component is represented with different colors. Table 13 contains the numerical data used to draw this figure.

As with divergence data, we found a principal component with a 70% contribution of chaperone dependence and a 30% contribution of expression level. This component explained ∼7% of the variance of dN/dS (table 13 and fig. 6). Another significant principal component explains 8.6% of the variance. This component is mainly determined by the number of protein–protein interactions, essentiality, and expression level. The other three significant components explained in combination <2% of the variance. In summary, we also found that chaperone dependence was the biological factor explaining the largest fraction of the total variance in the rate of evolution measured as dN/dS (5.87%), explaining a larger fraction of the total variance than expression level (4.23%) and the number of protein–protein interactions (3.76%) (table 14).
Table 14

Total Variance Explained by Each Variable in the Principal Component Regression Analysis of Polymorphism Data

| Predictor | dN/dS |
| --- | --- |
| Chaperone dependence | 5.87% |
| Number of protein–protein interactions | 3.76% |
| Expression level | 4.23% |
| Duplicability | 0.62% |
| Essentiality | 2.66% |
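The totals in table 14 can be approximately recovered from table 13. The sketch below assumes that each variable's total is the sum, over the five components, of the percentage of variance explained by the component multiplied by the variable's fractional contribution to it. This is an assumption about how the totals were derived, not something stated in the text, and the variable names are introduced here for illustration. Since the published inputs are rounded to two decimals, agreement is only expected to within a few hundredths of a percentage point.

```python
# Percent of variance in dN/dS explained by components 1 to 5 (table 13).
explained = [8.63, 0.30, 6.88, 0.91, 0.41]

# Fractional contribution of each variable to components 1 to 5 (table 13).
contributions = {
    "Chaperone dependence": [0.13, 0.06, 0.68, 0.05, 0.08],
    "Number of protein-protein interactions": [0.41, 0.02, 0.00, 0.02, 0.56],
    "Expression level": [0.22, 0.04, 0.29, 0.34, 0.12],
    "Duplicability": [0.00, 0.71, 0.02, 0.24, 0.03],
    "Essentiality": [0.25, 0.18, 0.01, 0.36, 0.22],
}

# Assumed rule: weight each component's explained variance by the variable's share.
total_variance = {
    name: sum(e * w for e, w in zip(explained, weights))
    for name, weights in contributions.items()
}

for name, value in total_variance.items():
    print(f"{name}: {value:.2f}%")
```

Under this assumption chaperone dependence comes out near 5.9%, ahead of expression level (about 4.3%) and the number of protein–protein interactions (about 3.8%), which reproduces the ranking reported in table 14.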
Finally, we performed an ANCOVA to evaluate the effect of chaperone dependence on the rate of protein evolution while controlling for the effect of the number of protein–protein interactions, expression level, and essentiality. As the continuous variable in the ANCOVA, we used the principal component of these three variables (principal component 1 in table 13 and fig. 6). We found that chaperone clients evolve on average 19.2% faster than the proteome average (P = 3.6 × 10−11) (fig. 7).
Fig. 7.—ANCOVA. Chaperone clients (gray points, continuous line) evolve 19.2% above the genome average rate (light points, dashed line) when considering genetic variants segregating in Saccharomyces cerevisiae.

Discussion

We study how the different yeast chaperones affect the evolutionary rate of their protein clients. In particular, we analyze the effect of chaperone dependence on protein evolution at two very different evolutionary time scales. We first study how chaperone-mediated folding has affected protein evolution over the evolutionary divergence of S. cerevisiae and S. paradoxus. We then study if the same process has left a signature on the patterns of standing genetic variation found in modern wild and domesticated strains of S. cerevisiae (Peter et al. 2018). We find that chaperone-mediated buffering has indeed left a trace on the protein-coding regions of the yeast genome, such that genes encoding chaperone clients (“client genes”) have diverged faster than genes encoding non-client proteins (“non-client genes”) when controlling for their number of protein–protein interactions. We also find that client genes have accumulated more genetic diversity than non-client genes among natural strains of S. cerevisiae. In a principal component regression analysis, we find that chaperone dependence explains the largest fraction of the observed variance in the rate of evolution at both evolutionary time scales. This contribution of chaperone-mediated folding to the variation in the rate of protein evolution is well above the fraction of the variance explained by other well-known factors that affect protein evolution such as expression level or protein–protein interactions (Pál et al. 2001; Fraser et al. 2002; Drummond et al. 2005).

Cost-benefit trade-offs are common in evolution, including protein evolution. Proteins are marginally stable (DePristo et al. 2005) and soluble (Tartaglia et al. 2007) inside a cell and their native structure is sensitive to mutations. Protein stability is a major constraint on protein evolution (Bloom et al. 2006; Zeldovich et al. 2007). Most nonsynonymous mutations diminish protein stability or solubility, and are therefore deleterious (Dobson 1999). Moreover, neofunctionalizing mutations that confer new protein functions, including new protein–protein interactions, tend to be highly destabilizing (Tokuriki et al. 2008; Soskine and Tawfik 2010). Therefore, in the absence of chaperone buffering, the cost of a neofunctionalizing mutation may be larger than its benefit (Tokuriki and Tawfik 2009). Chaperones, by diminishing the negative effect of mutations on protein stability and folding, can promote protein evolution, and potentiate the regulatory or metabolic effect of a protein mutation (Taipale et al. 2010).

Our finding that yeast chaperones can accelerate protein evolution is in line with previous observations that chaperones can act as evolutionary capacitors (Queitsch et al. 2002; Rutherford 2003; Jarosz and Lindquist 2010), buffer the destabilizing effect of mutations (Tokuriki and Tawfik 2009), facilitate the divergence of gene duplicates (Lachowiec et al. 2013), and ultimately allow proteins to explore a larger fraction of their sequence space (Williams and Fares 2010; Pechmann and Frydman 2014; Aguilar-Rodríguez et al. 2016; Kadibalban et al. 2016). However, it is important to note that chaperones do not just modify the effects of mutations affecting protein stability or folding. A chaperone can also modify (either buffer or potentiate) the fitness or phenotypic effects of mutations in proteins that do not have a direct functional relationship with it. For example, many of the protein clients of Hsp90 are transcription factors and signaling proteins (Taipale et al. 2010; Zabinsky et al. 2018). Therefore, the modifying effect of Hsp90 can percolate throughout the molecular networks of the cell, affecting mutations in many genes that do not have a direct physical or functional relationship with Hsp90.

Even if chaperone-buffered genetic variants are only rarely acquired, they could be enriched in a population if stabilizing selection does not remove them (because their deleterious phenotypic consequences are masked by a chaperone). A recent study has found evidence for this hypothesis among Hsp90-dependent variants in S. cerevisiae that affect cell size and shape (Geiler-Samerotte et al. 2016). We also find evidence for this enrichment of cryptic genetic variants within client genes among genetic variants segregating in S. cerevisiae.

Nonsynonymous mutations that allow the establishment of new physical interactions with other proteins are a class of neofunctionalizing mutations that can be highly destabilizing (Pechmann and Frydman 2014). Therefore, some chaperones can also buffer mutations that rewire protein interactions, thus promoting the evolution of protein networks (Pechmann and Frydman 2014), and perhaps explaining why chaperone clients tend to be well-connected in such networks, as we observed here. Furthermore, chaperones and their clients coevolve in a process where sequence changes in the chaperone may lead to compensatory changes in their clients and further rewiring of the protein networks they form (Koubkova-Yu et al. 2018).

In a multivariable statistical analysis, we find that the chaperones affecting rates of protein evolution belong to two major chaperone families: Hsp70s and Hsp90s. While there is ample evidence that Hsp90s can accelerate the rate of protein evolution in other eukaryotic species (Lachowiec et al. 2013, 2015; Pechmann and Frydman 2014), the evidence for eukaryotic Hsp70 chaperones having a similar effect is not so abundant. A previous study found that the ribosome-associated Hsp70 SSB chaperone that preferentially binds long and disordered nascent polypeptide chains accelerates the rate of accumulation of mutations likely to be destabilizing among weakly interacting clients (Pechmann and Frydman 2014). In a previous study, we found that bacterial DnaK, which belongs to the same major chaperone family, also accelerates protein evolution using a combination of experimental and comparative genomics approaches (Aguilar-Rodríguez et al. 2016). While it has been shown before that the chaperonin GroEL accelerates protein evolution (Bogumil and Dagan 2010; Williams and Fares 2010), we do not find good evidence here that the eukaryotic chaperonin system CCT, present in eukarya and archaea but absent from bacteria, has the same effect on protein evolution. We find that the chaperone Hsp104 from the Hsp100 family accelerates the evolution of its protein clients when controlling for number of protein–protein interactions. This may be the first observation that this important chaperone can affect protein evolutionary rates. However, we do not observe any effect of the family Hsp100 (Hsp78 and Hsp104) when controlling for possible confounding variables in a multiple linear regression and in a principal component regression analysis. Finally, we do not detect any significant effect of small heat shock proteins on the rate of evolution of their clients.

In summary, we analyzed the evolution of proteins that are subjected to folding assisted by different chaperones in the complex yeast chaperone network over two different evolutionary time scales. Our comparative approach indicates that chaperone-assisted folding increases the rate of protein evolution when properly controlling for confounding factors at both time scales. We show how protein chaperones, by virtue of their role in modulating protein genotype–phenotype maps, have a disproportionate effect on the evolution of the protein-coding regions of a genome. Our results highlight the importance of integrating different cellular factors when studying protein sequence evolution.