Jacek Kominek, Jaroslaw Marszalek, Cécile Neuvéglise, Elizabeth A Craig, Barry L Williams.
Abstract
Hsp70 molecular chaperones are ubiquitous. By preventing aggregation, promoting folding, and regulating degradation, Hsp70s are major factors in the ability of cells to maintain proteostasis. Despite a wealth of functional information, little is understood about the evolutionary dynamics of Hsp70s. We undertook an analysis of Hsp70s in the fungal clade Ascomycota. Using the 14 well-characterized Hsp70s of Saccharomyces cerevisiae, we identified 491 orthologs from 53 genomes. Saccharomyces cerevisiae Hsp70s fall into seven subfamilies: four canonical-type Hsp70 chaperones (SSA, SSB, KAR, and SSC) and three atypical Hsp70s (SSE, SSZ, and LHS) that play regulatory roles, modulating the activity of canonical Hsp70 partners. Each of the 53 surveyed genomes harbored at least one member of each subfamily, thus establishing these seven Hsp70s as units of function and evolution. Genomes of some species contained only one member of each subfamily, that is, only seven Hsp70s. Overall, members of each subfamily formed a monophyletic group, suggesting that each subfamily diversified from a corresponding ancestral gene present in the common ancestor of all surveyed species. However, the pattern of evolution varied across subfamilies. At one extreme, members of the SSB subfamily evolved under concerted evolution. At the other extreme, the SSA and SSC subfamilies exhibited a high degree of copy number dynamics, consistent with a birth-death mode of evolution. The KAR, SSE, SSZ, and LHS subfamilies evolved in a simple divergent mode with little copy number dynamics. Together, our data revealed that the evolutionary history of this highly conserved and ubiquitous protein family was surprisingly complex and dynamic.
Keywords: Ascomycota genomes; birth-and-death; concerted evolution; gene duplication; molecular chaperones; multi gene family
Year: 2013 PMID: 24277689 PMCID: PMC3879978 DOI: 10.1093/gbe/evt192
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Phylogenetic distribution, copy number dynamics, and protein phylogeny of Hsp70s in fungi. (A) Phylogenetic distribution of the Hsp70 orthologs from seven subfamilies across the surveyed fungal species. The number of orthologs identified for each Hsp70 subfamily is indicated; the number in parentheses includes putative orthologs (pseudogenes or partial hits due to low-quality genomic data; supplementary fig. S1, Supplementary Material online). The SSC column includes orthologs of the SSC3 and SSQ1 genes. The species tree was based on various sources (see Materials and Methods for details). (B) Maximum-likelihood tree of amino acid sequences from all identified Hsp70 orthologs. Scale is in expected amino acid substitutions per site. **Bootstrap ≥ 95, *bootstrap ≥ 70.
Figure. Protein phylogeny and copy number dynamics of the SSA subfamily. (A) Bayesian tree of amino acid sequences from the SSA subfamily. Scale is in expected amino acid substitutions per site. **Posterior probability ≥ 0.95, *posterior probability ≥ 0.7. (B) Cladogram showing the hypothetical scenario of SSA evolution in the Yarrowia clade. Gene copies named A–G exhibit synteny across at least two species, gene copies named X1–X6 are taxon specific, and gene copies named “p” are putative pseudogenes. (C) Cladogram showing the hypothetical scenario of SSA evolution in the CTG and Saccharomycetaceae clades; gene copies named X1 and X2 are taxon specific. (D) Copy number dynamics in the SSA subfamily. The number of species in each clade is indicated in parentheses.
Global Analysis of Hsp70 Subfamilies in Ascomycota
| Subfamily | No. of Orthologs^a | Length (aa)^b | Identity (%)^c | Cellular Localization |
|---|---|---|---|---|
| Canonical | | | | |
| SSA | 117 | 645 ± 5 | 81.68 ± 4.81 | Cytosol |
| SSB | 61 | 613 ± 1 | 80.45 ± 8.80 | Cytosol |
| KAR | 53 | 674 ± 9 | 72.46 ± 6.99 | ER |
| SSC | 53 | 658 ± 13 | 74.51 ± 8.08 | Mitochondria |
| SSQ | 26 | 648 ± 8 | 59.63 ± 10.64 | Mitochondria |
| Atypical | | | | |
| SSE | 60 | 703 ± 18 | 55.62 ± 13.13 | Cytosol |
| SSZ | 54 | 550 ± 15 | 49.44 ± 13.40 | Cytosol |
| LHS | 52 | 947 ± 61 | 26.42 ± 12.76 | ER |

^a Number of identified orthologs, excluding putative orthologs and sequences with unique internal deletions.
^b Mean sequence length ± SD.
^c Mean pairwise amino acid sequence identity ± SD.
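
The identity column reports a mean pairwise amino acid identity with its SD. As a minimal sketch of how such a summary could be computed from an alignment, the Python below averages percent identity over all sequence pairs. The function names, the toy alignment, the choice to skip columns where either sequence has a gap, and the use of the sample SD are all assumptions made for illustration; the source does not state how gaps were handled or which SD estimator was used.

```python
from itertools import combinations
from statistics import mean, stdev


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    # Assumption: gapped columns are excluded from the comparison.
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


def identity_summary(alignment):
    """Mean and sample SD of percent identity over all sequence pairs."""
    values = [pairwise_identity(a, b) for a, b in combinations(alignment, 2)]
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


# Hypothetical four-sequence alignment, not taken from the study.
toy_alignment = ["MKAV", "MKAV", "MRAV", "MR-V"]
avg, sd = identity_summary(toy_alignment)
print(f"{avg:.2f} ± {sd:.2f}")
```

With real data, `toy_alignment` would be replaced by the aligned protein sequences of one subfamily.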
Figure. Amino acid usage among Hsp70 subfamilies. (A) Distribution of sites with the indicated amino acid usage, defined as the number of different amino acids observed per site in a multiple sequence alignment. (Inset) Values of the shape parameter α of the gamma distribution fitted to the amino acid usage data for each Hsp70 subfamily, as indicated. (B) Maximum amino acid usage covering the indicated proportion of sites in each Hsp70 subfamily.
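
The caption defines amino acid usage as the number of different amino acids observed per site in a multiple sequence alignment. The sketch below counts that value per column and derives two summaries that loosely correspond to panels (A) and (B). All names and the toy alignment are hypothetical, gaps are ignored by assumption, and the reading of panel (B) as "the smallest usage value that covers a given proportion of sites" is an interpretation of the caption. The gamma fit mentioned for the inset is not reproduced here.

```python
import math
from collections import Counter


def amino_acid_usage(alignment, gap="-"):
    """Number of distinct amino acids per alignment column (gaps ignored)."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [len({res for res in column if res != gap})
            for column in zip(*alignment)]


def usage_distribution(usage):
    """Fraction of sites observed at each usage value."""
    counts = Counter(usage)
    return {value: counts[value] / len(usage) for value in sorted(counts)}


def max_usage_covering(usage, proportion):
    """Smallest usage u such that at least `proportion` of sites have usage <= u."""
    ordered = sorted(usage)
    needed = max(math.ceil(proportion * len(ordered)), 1)
    return ordered[needed - 1]


# Hypothetical alignment, not taken from the study.
toy_alignment = ["MKAV", "MKSV", "MRTI", "MRGL"]
usage = amino_acid_usage(toy_alignment)
print(usage)
print(usage_distribution(usage))
print(max_usage_covering(usage, 0.75))
```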
Codon Usage and Evolutionary Rates of the Hsp70 Genes in Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. uvarum
| Hsp70 | Total Codons^a | Invariant Codons^b | % Invariant | CAI^c | ENC | Protein Abundance (ppm)^d | dN | dS | dN/dS |
|---|---|---|---|---|---|---|---|---|---|
| SSA1 | 642 | 361 | 56.2 | 0.65 ± 0.06 | 29.85 ± 1.34 | 8,178 | 0.003 ± 0.002 | 0.259 ± 0.169 | 0.013 |
| SSA2 | 639 | 426 | 66.7 | 0.78 ± 0.02 | 27.40 ± 0.43 | 7,986 | 0.002 ± 0.001 | 0.234 ± 0.110 | 0.010 |
| SSA3 | 647 | 188 | 29.1 | 0.19 ± 0.01 | 47.29 ± 3.20 | 290 | 0.007 ± 0.003 | 0.292 ± 0.127 | 0.025 |
| SSA4 | 642 | 175 | 27.3 | 0.21 ± 0.03 | 43.09 ± 1.57 | 475 | 0.006 ± 0.004 | 0.331 ± 0.220 | 0.019 |
| SSB1 | 613 | 420 | 68.5 | 0.80 ± 0.02 | 27.57 ± 0.71 | 2,320 | 0.002 ± 0.001 | 0.186 ± 0.122 | 0.010 |
| SSB2 | 613 | 405 | 66.1 | 0.77 ± 0.02 | 27.83 ± 0.40 | 2,791 | 0.002 ± 0.001 | 0.176 ± 0.083 | 0.012 |
| KAR2 | 682 | 300 | 44.0 | 0.45 ± 0.04 | 35.81 ± 0.74 | 523 | 0.004 ± 0.003 | 0.238 ± 0.145 | 0.018 |
| SSC1 | 651 | 285 | 43.8 | 0.51 ± 0.02 | 35.05 ± 1.36 | 709 | 0.004 ± 0.003 | 0.257 ± 0.186 | 0.016 |
| SSQ1 | 654 | 185 | 28.3 | 0.14 ± 0.01 | 49.93 ± 2.62 | 26.4 | 0.014 ± 0.006 | 0.307 ± 0.123 | 0.045 |
| SSC3 (d)^e | 643 | 181 | 28.1 | 0.20 ± 0.01 | 47.50 ± 0.92 | 86 | 0.027 ± 0.011 | 0.372 ± 0.151 | 0.073 |
| SSC3 (c)^f | 630 | 281 | 44.6 | 0.52 ± 0.04 | 34.24 ± 0.31 | n/a | n/a | n/a | n/a |
| SSE1 | 693 | 296 | 42.7 | 0.55 ± 0.05 | 34.04 ± 1.70 | 1,959 | 0.007 ± 0.004 | 0.289 ± 0.178 | 0.025 |
| SSE2 | 693 | 180 | 26.0 | 0.20 ± 0.01 | 48.91 ± 1.62 | 155 | 0.017 ± 0.009 | 0.319 ± 0.170 | 0.053 |
| SSZ1 | 538 | 186 | 34.6 | 0.44 ± 0.05 | 36.02 ± 2.09 | 1,040 | 0.011 ± 0.007 | 0.274 ± 0.194 | 0.038 |
| LHS1 | 881 | 191 | 21.7 | 0.15 ± 0.01 | 50.69 ± 2.08 | 20.3 | 0.038 ± 0.022 | 0.302 ± 0.179 | 0.125 |
Note.—n/a, not applicable.
^a Total number of aligned codons.
^b Number of identical codons in the analyzed species.
^c CAI shown as mean ± SD; for statistical significance, see supplementary figure S10, Supplementary Material online.
^d Protein abundance in parts per million (Wang et al. 2012).
^e SSC3 (d): divergent SSC3 orthologs from S. cerevisiae, S. paradoxus, S. kudriavzevii, and S. uvarum that form a separate clade on the phylogenetic tree of the SSC subfamily (supplementary fig. S4, Supplementary Material online).
^f SSC3 (c): gene-converted SSC3 orthologs from Candida glabrata and Vanderwaltozyma polyspora that cluster with SSC1 orthologs from their respective species on the phylogenetic tree of the SSC subfamily (supplementary fig. S4, Supplementary Material online).
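
The codon usage columns count aligned codons, the codons identical across the analyzed species, and the resulting percentage. A minimal sketch of that bookkeeping follows, assuming in-frame, gap-free aligned coding sequences of equal length. The function names and the toy sequences are hypothetical and do not come from the study.

```python
def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def invariant_codon_stats(aligned_cds):
    """Return (total codons, invariant codons, percent invariant).

    Assumption: sequences are codon-aligned and contain no gaps.
    """
    codon_lists = [split_codons(seq.upper()) for seq in aligned_cds]
    if len({len(codons) for codons in codon_lists}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    total = len(codon_lists[0])
    # A codon position is invariant when every species carries the same codon.
    invariant = sum(len(set(column)) == 1 for column in zip(*codon_lists))
    return total, invariant, 100.0 * invariant / total


# Hypothetical three-species alignment of three codons.
toy_cds = ["ATGAAAGCT", "ATGAAGGCT", "ATGAAAGCT"]
total, invariant, percent = invariant_codon_stats(toy_cds)
print(total, invariant, round(percent, 1))
```

The same arithmetic links the table's own columns: for SSA1, 361 invariant codons out of 642 gives `round(100 * 361 / 642, 1)`, which is 56.2.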
Figure. Sequence evolution and regulatory divergence of the SSB subfamily. (A) Bayesian trees of nucleotide sequences of SSB orthologs from post-WGD Saccharomycetaceae species: (left) the 5'-IGS, (middle) open reading frames (SSB ORF), and (right) the 3'-IGS. The SSB ORF tree was rooted using the SSB1 ortholog from Candida albicans and is shown as a cladogram for clarity (for the corresponding phylogram, see supplementary fig. S2, Supplementary Material online). Scale is in expected nucleotide substitutions per site. **Posterior probability ≥ 0.95, *posterior probability ≥ 0.7. Differences aa/nn indicate the numbers of amino acid (aa) and nucleotide (nn) differences between paralogous SSB1/SSB2 sequences from a given species. Residues listed along branches indicate amino acid substitutions between each SSB1 and SSB2 pair. Substitutions present in more than one species are marked in bold. (B) Schematic representation of predicted TF-binding sites present in the 5'-IGSs of SSB1 and SSB2 orthologs from the indicated post-WGD species. Underlined TF-binding sites were not identified in the pre-WGD species. The diagram is not drawn to scale. For clarity, specific locations of individual TF-binding sites are not indicated.
Figure. Amino acid sequence divergence of SSQ1 and SSC3 orthologs. (A) Summary of sequence divergence analysis. Group A: SSC1preSSQ1, 27 sequences of SSC1 orthologs from species predating the emergence of SSQ1; SSC1preSSC3, 6 sequences of SSC1 orthologs from species predating the emergence of SSC3 (see supplementary fig. S4, Supplementary Material online, for details). Group B: 26 sequences of SSQ1 orthologs and 4 sequences of SSC3 orthologs. Conserved sites: number of sites with residues identical in groups A and B. Relaxed sites: number of sites that are invariant in A but variable in B. Fixed sites: number of sites that are variable in A but invariant in B. Switched sites: number of sites that are invariant within group A and within group B but occupied by different amino acids in each group. Radical: number of sites where the substitution from the most prevalent residue in group A to the most prevalent residue in group B had a BLOSUM62 score equal to or less than zero. Conservative: number of sites where the substitution from the most prevalent residue in group A to the most prevalent residue in group B had a BLOSUM62 score strictly greater than zero. (B) Approximate localization of the radical fixed and radical switched sites within sequences orthologous to SSQ1 and SSC3, as indicated. Fixed sites are marked by bars above the sequence and switched sites by bars below the sequence, with radical sites marked by their sequence position. Pale fragments indicate patches of at least nine consecutive gaps in the alignment of sequences from groups A and B, which were therefore not subjected to the analysis. SP, signal peptide; L, linker; SBD-α, substrate-binding domain alpha-helical lid.
Figure. Amino acid sequence divergence of the atypical SSE, SSZ, and LHS subfamilies. (A) Summary of sequence divergence analysis. Group A: canonical Hsp70 homologs (223 sequences) belonging to the SSA, KAR, and SSC subfamilies. Group B: Hsp70 orthologs belonging to the SSE (60 sequences), SSZ (54 sequences), and LHS (52 sequences) subfamilies, as indicated. Conserved sites: number of sites with residues identical in groups A and B. Relaxed sites: number of sites that are invariant in A but variable in B. Fixed sites: number of sites that are variable in A but invariant in B. Switched sites: number of sites that are invariant within group A and within group B but occupied by different amino acids in each group. Radical: number of sites where the substitution from the most prevalent residue in group A to the most prevalent residue in group B had a BLOSUM62 score equal to or less than zero. Conservative: number of sites where the substitution from the most prevalent residue in group A to the most prevalent residue in group B had a BLOSUM62 score strictly greater than zero. (B) Approximate localization of the radical fixed and radical switched sites within sequences orthologous to SSE1, SSZ1, and LHS1, as indicated. Fixed sites are marked by bars above the sequence and switched sites by bars below the sequence, with radical sites marked by their sequence position. Pale fragments indicate patches of at least nine consecutive gaps in the alignment of sequences from groups A and B, which were therefore not subjected to the analysis. SP, signal peptide; L, linker; SBD-α, substrate-binding domain alpha-helical lid.
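
The site categories defined in the two sequence divergence captions (conserved, relaxed, fixed, switched, and the radical versus conservative split) can be expressed as a small decision procedure. The sketch below is one possible reading of those definitions, not the authors' implementation. It assumes gap-free columns, labels sites that vary in both groups as "variable" (a name introduced here), and uses only a tiny hand-copied excerpt of BLOSUM62 scores; a real analysis would load the full matrix. All function names and the toy groups are hypothetical.

```python
from collections import Counter

# Excerpt of BLOSUM62 off-diagonal scores, enough for the toy data below.
BLOSUM62_EXCERPT = {
    frozenset("DE"): 2, frozenset("KR"): 2, frozenset("IL"): 2,
    frozenset("AS"): 1, frozenset("GP"): -2, frozenset("DL"): -4,
}


def excerpt_score(res_a, res_b):
    """Look up a substitution score in the excerpt (raises KeyError if absent)."""
    return BLOSUM62_EXCERPT[frozenset((res_a, res_b))]


def classify_site(col_a, col_b):
    """Assign one alignment column to a category from the captions."""
    set_a, set_b = set(col_a), set(col_b)
    invariant_a, invariant_b = len(set_a) == 1, len(set_b) == 1
    if invariant_a and invariant_b:
        return "conserved" if set_a == set_b else "switched"
    if invariant_a:
        return "relaxed"
    if invariant_b:
        return "fixed"
    return "variable"  # varies in both groups; label is an assumption


def substitution_type(col_a, col_b, score=excerpt_score):
    """Radical if the score between the most prevalent residues is <= 0."""
    top_a = Counter(col_a).most_common(1)[0][0]
    top_b = Counter(col_b).most_common(1)[0][0]
    if top_a == top_b:
        # Identical residues: BLOSUM62 diagonal scores are all positive.
        return "conservative"
    return "conservative" if score(top_a, top_b) > 0 else "radical"


def classify_alignment(group_a, group_b, score=excerpt_score):
    """Classify every column; fixed and switched sites also get a type."""
    results = []
    for col_a, col_b in zip(zip(*group_a), zip(*group_b)):
        category = classify_site(col_a, col_b)
        kind = None
        if category in ("fixed", "switched"):
            kind = substitution_type(col_a, col_b, score)
        results.append((category, kind))
    return results


# Hypothetical aligned groups, not taken from the study.
group_a = ["MKDLA", "MKDIA", "MKDLA"]
group_b = ["MRLLS", "MRLLA"]
for position, result in enumerate(classify_alignment(group_a, group_b), start=1):
    print(position, result)
```

Passing the scoring function as a parameter keeps the classification logic independent of where the substitution matrix comes from.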