Ossi Turunen, Ralph Seelke, Jed Macosko.
Abstract
A fairly recent whole-genome duplication (WGD) event in yeast enables the effects of gene duplication and subsequent functional divergence to be characterized. We examined 15 ohnolog pairs (i.e. paralogs from a WGD) out of c. 500 Saccharomyces cerevisiae ohnolog pairs that have persisted over an estimated 100 million years of evolution. These 15 pairs were chosen for their high levels of asymmetry, i.e. within the pair, one ohnolog had evolved much faster than the other. Sequence comparisons of the 15 pairs revealed that the faster evolving duplicated genes typically appear to have experienced partially, but not fully, relaxed negative selection, as evidenced by an average nonsynonymous/synonymous substitution ratio (dN/dS(avg)=0.44) that is higher than the slow-evolving genes' ratio (dN/dS(avg)=0.14) but still <1. An increased number of insertions and deletions in the fast-evolving genes also indicated loosened structural constraints. Sequence and structural comparisons indicated that a subset of these pairs had significant differences in their catalytically important residues and active or cofactor-binding sites. A literature survey revealed that several of the fast-evolving genes have gained a specialized function. Our results indicate that subfunctionalization and even neofunctionalization have occurred along with degenerative evolution, in which unneeded functions were destroyed by mutations.
Year: 2009 PMID: 19133069 PMCID: PMC2704937 DOI: 10.1111/j.1567-1364.2008.00451.x
Source DB: PubMed Journal: FEMS Yeast Res ISSN: 1567-1356 Impact factor: 2.796
Size distribution of insertions and deletions (indels) among the 15 duplicated gene pairs
| Indel size | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | >10 | Totals | Amino acids |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S INS | 3 | – | 1 | – | – | – | – | – | – | 4 | 6 |
| S DEL | 1 | 1 | 1 | 1 | – | – | – | – | 2 | 6 | 40 |
| F INS | 19 | 5 | 2 | – | 1 | 2 | 1 | – | – | 30 | 59 |
| F DEL | 6 | 4 | 3 | 1 | 1 | 3 | 1 | 4 | 6 | 29 | 221 |
| Totals | 29 | 10 | 7 | 2 | 2 | 5 | 2 | 4 | 8 | 69 | 326 |
The indels were determined in comparison to the corresponding Kluyveromyces waltii gene. The N- and C-terminal length variation was excluded.
Amino acids: number of amino acids changed by the indels.
S, slow-evolving gene; F, fast-evolving gene; INS, insertion; DEL, deletion.
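The imbalance between slow- and fast-evolving genes in the table above can be tallied with a short script. This is only an illustrative sketch: the counts are copied from the "Totals" and "Amino acids" columns, and the dictionary layout and the `share` helper are hypothetical names introduced here, not part of the original analysis.

```python
# Indel counts and affected amino acids per class, copied from the table above.
indels = {
    "S INS": {"count": 4, "amino_acids": 6},
    "S DEL": {"count": 6, "amino_acids": 40},
    "F INS": {"count": 30, "amino_acids": 59},
    "F DEL": {"count": 29, "amino_acids": 221},
}


def share(prefix, field):
    """Fraction of `field` contributed by rows whose label starts with `prefix`."""
    total = sum(row[field] for row in indels.values())
    part = sum(
        row[field] for label, row in indels.items() if label.startswith(prefix)
    )
    return part / total


fast_count_share = share("F", "count")        # 59 of 69 indels
fast_aa_share = share("F", "amino_acids")     # 280 of 326 amino acids

print(f"Fast-evolving genes: {fast_count_share:.1%} of indels, "
      f"{fast_aa_share:.1%} of affected amino acids")
```

Under these numbers, the fast-evolving genes account for 59 of the 69 indels and 280 of the 326 affected amino acids.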
Divergence characteristics of duplicated yeast genes
| Gene pair (K. waltii gene no.) | Gene | Amino acids | pI | Identity of y2–y1 (%) | Identity with K. waltii (%) | Indels | dN/dS [SE] |
|---|---|---|---|---|---|---|---|
| 8105 | y1 (slow) | 499 | 7.46 | | 89 | 1 | 0.065 [0.010] |
| | y2 (fast) | 493 | 5.08 | 41 | 41 | 8 | 0.524 [0.066] |
| 23 042 | y1 (slow) | 198 | 5.61 | | 77 | 0 | 0.183 [0.040] |
| | y2 (fast) | 210 | 5.03 | 47 | 54 | 5 | 0.206 [0.088] |
| 22 001 | y1 (slow) | 375 | 8.92 | | 78 | 0 | 0.115 [0.021] |
| | y2 (fast) | 375 | 7.55 | 44 | 42 | 4 | 0.461 [0.076] |
| 6157 | y1 (slow) | 2233 | 6.22 | | 81 | 3 | 0.156 [0.011] |
| | y2 (fast) | 2273 | 8.05 | 55 | 55 | 11 | 0.685 [0.041] |
| 15 007 | y1 (slow) | 399 | 5.01 | | 83 | 0 | 0.119 [0.022] |
| | y2 (fast) | 345 | 4.96 | 55 | 56 | 3 | 0.497 [0.065] |
| 24 238 | y1 (slow) | 549 | 5.36 | | 57 | 0 | 0.163 [0.030] |
| | y2 (fast) | 320 | 10.48 | 21 | 21 | 9 | 1.299 [0.222] |
| 2978 | y1 (slow) | 210 | 5.12 | | 78 | 2 | 0.112 [0.039] |
| | y2 (fast) | 220 | 4.99 | 64 | 57 | 4 | 0.291 [0.065] |
| 7837 | y1 (slow) | 304 | 5.26 | | 84 | 0 | 0.082 [0.022] |
| | y2 (fast) | 310 | 7.95 | 64 | 64 | 1 | 0.254 [0.048] |
| 5576 | y1 (slow) | 484 | 5.07 | | 76 | | 0.092 [0.021] |
| | y2 (fast) | 433 | 6.27 | 53 | 57 | | 0.348 [0.048] |
| 4569 | y1 (slow) | 352 | 5.78 | | 62 | 2 | 0.238 [0.041] |
| | y2 (fast) | 300 | 8.15 | 32 | 27 | | 0.830 [0.124] |
| 6945 | y1 (slow) | 500 | 7.66 | | 86 | 0 | 0.258 [0.039] |
| | y2 (fast) | 506 | 6.90 | 71 | 70 | 0 | 0.168 [0.023] |
| 23 198 | y1 (slow) | 346 | 6.66 | | 86 | 1 | 0.217 [0.038] |
| | y2 (fast) | 351 | 6.34 | 76 | 74 | 0 | 0.185 [0.029] |
| 3922 | y1 (slow) | 667 | 5.88 | | 82 | 0 | 0.127 [0.016] |
| | y2 (fast) | 618 | 7.50 | 59 | 56 | 4 | 0.329 [0.037] |
| 1862 | y1 (slow) | 138 | 6.93 | | 84 | 0 | 0.097 [0.030] |
| | y2 (fast) | 142 | 8.04 | 62 | 63 | 0 | 0.224 [0.067] |
| 13 644 | y1 (slow) | 347 | 10.35 | | 78 | 0 | 0.132 [0.025] |
| | y2 (fast) | 310 | 10.2 | 59 | 58 | 2 | 0.345 [0.049] |
The numerical parameters measure the divergence of the slow-evolving (upper) and fast-evolving (lower) gene of each pair from the corresponding Kluyveromyces waltii gene, which was used as an outgroup. Sequence identity, number of indels and dN/dS ratio are calculated from the comparison with the singleton K. waltii gene.
Theoretical pI.
y1, slow-evolving yeast gene; y2, fast-evolving yeast gene.
Identity with the corresponding single K. waltii gene.
Number of intragenic insertions and deletions when compared to K. waltii gene.
The modified Nei–Gojobori method with Jukes–Cantor correction (transition/transversion ratio of 3) was used to estimate the pairwise distances to the K. waltii gene (analysis using MEGA 3.1). SE: standard error.
The differing dN/dS ratios are mainly due to differences outside the BC and CT domains (see Table S1B).
Values concern the overlapping regions between CET1, CTL1 and the K. waltii gene (whereas the pI value concerns the full CET1 protein). In calculating the dN/dS ratio, the regions corresponding to the 54 N-terminal amino acids in CTL1 were excluded.
The corresponding K. waltii gene (no. 7837) is apparently missing 25% of its sequence from the N-terminus; the reported sequence analysis concerns only the region present in the K. waltii gene.
The alignments are unclear in the C-terminal region in which the indels occur.
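As a consistency check, the per-gene ratios in the last column of the table above can be averaged and compared with the values quoted in the abstract. The sketch below assumes that this column is the dN/dS ratio (as the table legend states) and that a plain arithmetic mean was used; the list names are hypothetical, and the values are copied from the table in row order.

```python
from statistics import mean

# dN/dS ratios copied from the table above, one entry per gene pair, in row order.
slow = [0.065, 0.183, 0.115, 0.156, 0.119, 0.163, 0.112, 0.082,
        0.092, 0.238, 0.258, 0.217, 0.127, 0.097, 0.132]
fast = [0.524, 0.206, 0.461, 0.685, 0.497, 1.299, 0.291, 0.254,
        0.348, 0.830, 0.168, 0.185, 0.329, 0.224, 0.345]

mean_slow = mean(slow)
mean_fast = mean(fast)

# Number of pairs in which the fast-evolving gene has the higher ratio.
fast_higher = sum(f > s for s, f in zip(slow, fast))

# Number of fast-evolving genes with a ratio above 1.
fast_above_one = sum(f > 1 for f in fast)

print(f"mean dN/dS: slow = {mean_slow:.2f}, fast = {mean_fast:.2f}")
print(f"fast > slow in {fast_higher} of {len(fast)} pairs; "
      f"{fast_above_one} fast gene(s) with dN/dS > 1")
```

With these inputs the means round to 0.14 and 0.44, matching the averages given in the abstract, and only one fast-evolving gene has a ratio above 1.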
Predicted and Experimental Locations
| Gene 1 (slow evolving), common/systematic name | Gene 1 predicted location | Gene 1 location in Huh et al. | Gene 1 other experimental location(s) | Gene 2 (fast evolving), common/systematic name | Gene 2 predicted location | Gene 2 location in Huh et al. | Gene 2 other experimental location(s) |
|---|---|---|---|---|---|---|---|
| UGP1/YKL035W | Weak nucleus | Cytoplasm | – | YHL012W/YHL012W | Nucleus | – | – |
| PST2/YDR032C | Endoplasmic reticulum (ER) | Cytoplasm (punctate composite) | Cytoplasm (punctate), chromatin | RFS1/YBR052C | No clear prediction | Cytoplasm (punctate composite) | Cytoplasm (punctate), chromatin |
| MCK1/YNL307C | Mitochondrial | Cytoplasm, nucleus | Centromere | YGK3/YOL128C | Nucleus | – | – |
| ACC1/YNR016C | Cytoplasm | Cytoplasm (punctate composite) | Cytoplasm | HFA1/YMR207C | Mitochondria | Mitochondria | Mitochondria |
| RNR2/YJL026W | Cytoplasm, nucleus | Cytoplasm, nucleus | Cytoplasm, nucleus | RNR4/YGR180C | Cytoplasm, nucleus | Cytoplasm, nucleus | Cytoplasm, nucleus |
| CET1/YPL228W | Nucleus | Nucleus | Nucleus | CTL1/YMR180C | Mitochondria, nucleus | – | Cytoplasm, nucleus |
| VPS21/YOR089C | ER | Cytoplasm, nucleus | Transport vesicles | YPT53/YNL093W | ER | – | Transport vesicles |
| SEC14/YMR079W | Cytoplasm | Cytoplasm, nucleus | Cytoplasm | SFH1/YKL091C | Nucleus | Cytoplasm, nucleus | Nucleus |
| SLT2/YHR030C | Nucleus | Cytoplasm, nucleus | Nucleus, bud tip | YKL161C/YKL161C | Nucleus | – | – |
| GCS1/YDL226C | Cytoplasm | Cytoplasm | – | SPS18/YNL204C | Nucleus | – | – |
| CDC19/YAL038W | Cytoplasm | Cytoplasm | Cytoplasm | PYK2/YOR347C | Nucleus | Cytoplasm | Cytoplasm, mitochondria |
| ADH1/YOL086C | Cytoplasm | – | Cytoplasm | ADH5/YBR145W | Nucleus | Cytoplasm, nucleus | Cytoplasm, nucleus |
| GRS1/YBR121C | Cytoplasm | Cytoplasm | Cytoplasm, mitochondria | GRS2/YPR081C | Nucleus | Cytoplasm | Cytoplasm |
| ERV14/YGL054C | Integral membrane | ER, vacuole | ER–Golgi | ERV15/YBR210W | Integral membrane | – | – |
| FEN1/YCR034W | Integral membrane | ER | ER | ELO1/YJL196C | Integral membrane | – | ER |
Prediction of cellular location signal was done primarily at the Yeast Protein Localization Server (http://bioinfo.mbb.yale.edu/genome/localize/).
Location of GFP-tagged proteins from Huh et al.
References providing the location are found in the Supporting Information.
Divergence of the active sites and binding sites in the duplicated gene pairs
| Gene pair (slow/fast) | Gene function (slow / fast) | Change | Comments (for literature references see the main text and the Supporting Information) |
|---|---|---|---|
| UGP1/YHL012W | UDP-glucose pyrophosphorylase / Unknown function | A B | The potential active site and the glucose-1-phosphate and PPi-binding sites of |
| PST2/RFS1 | Both flavodoxin-fold proteins with chromatin association | A B A′ B′ | A partially modeled flavin mononucleotide (FMN)-binding pocket is conserved in |
| MCK1/YGK3 | GSK-3 homolog / GSK-3 homolog, diverged substrate specificity? | A B A* | |
| ACC1/HFA1 | Both acetyl-CoA carboxylases | A B′ A B′ | In comparison with |
| RNR2/RNR4 | Ribonucleotide reductase / Stabilizing component for ribonucleotide reductase | A′B′ | |
| CET1/CTL1 | RNA triphosphatase / RNA degradation and processing? | A B A | |
| VPS21/YPT53 | Both | A B A B* | The GTP-binding and the protein family sequence features are largely conserved in |
| SEC14/SFH1 | PI/PC transfer protein / Weak phospholipid transfer? | A B A B*C? | SEC14 is a phosphatidylinositol/phosphatidylcholine transfer protein. Functional sites are conserved in |
| SLT2/YKL161C | MAP kinase / Kinase with altered substrate specificity? | A B A′B* | |
| GCS1/SPS18 | ARF-GAP protein / Unknown function in sporulation | A B A′B*C? | |
| CDC19/PYK2 | Both pyruvate kinases | A B A B′ | |
| ADH1/ADH5 | Both alcohol dehydrogenases | A B A′B′ | |
| GRS1/GRS2 | Glycyl tRNA synthetase / Defective glycyl tRNA synthetase? | A B A* | GRS1 is a glycyl-tRNA-synthetase. Unlike |
| ERV14/ERV15 | Cargo receptor cycling between ER and Golgi / Similar function to | A B A′ | |
| FEN1/ELO1 | Long chain fatty acid elongase / Short chain elongase | A′B A′B* | |
Change: A and B represent gene functions. Markings (′, * and strikethrough) represent the degree of change (e.g. A′, minor change in function A; A*, major change in function A; and B with strikethrough, deleted function B). C? represents a possible new function. The distinction between major and minor change is not always clear; some minor changes may prove to be major upon further investigation, and vice versa. See Fig. 4 for more details.
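The notation in this legend can be read mechanically. The sketch below is a hypothetical helper, not part of the original study: it decodes the code of a single gene (for example `A′B*C?`) into function/meaning pairs. It assumes that an unmarked letter means an unchanged function, and it cannot handle strikethrough (deleted functions), which does not survive as plain text.

```python
import re

# Marker meanings as given in the legend above. Treating an unmarked letter
# as "unchanged" is an assumption of this sketch.
MARKERS = {
    "": "unchanged",
    "′": "minor change",
    "*": "major change",
    "?": "possible new function",
}

# One capital letter (the function) followed by at most one marker.
TOKEN = re.compile(r"([A-Z])([′*?]?)")


def parse_change(code):
    """Split a single-gene code such as 'A′B*C?' into (function, meaning) pairs."""
    compact = code.replace(" ", "")
    return [(name, MARKERS[mark]) for name, mark in TOKEN.findall(compact)]


for function, meaning in parse_change("A′B*C?"):
    print(function, "->", meaning)
```

Note that each "Change" cell in the table concatenates the codes of both genes of a pair, so a cell has to be split into its slow-gene and fast-gene parts before it is passed to `parse_change`.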
Fig. 4. Schematic presentation of possible divergence modes. The ancestral protein with subfunctions A and B was duplicated in the WGD, and this figure shows schematically how the functional divergence has led to many types of changes in the diverging gene pair. Reduction of functions is common among the fast-evolving genes in the group of 15 gene pairs. Fast-evolving genes have also adopted new roles in the yeast cell. A′, B′, minor changes in the subfunctions; A*, B*, novel functional properties (e.g. changes in location, interaction with substrate, or protein–protein interactions); A′B′ with strikethrough, deletion of subfunctions; and C, completely new (sub)function.
Fig. 1. Examples of functional sites in which the fast-evolving yeast protein has diverged significantly. (a) Sulfate-binding site in the mouse GSK-3β and yeast proteins. (b) Phosphate anchor motif GXGXXG in MAP kinases. (c) Binding site (bold and underlined) of Rattus norvegicus ARFGAP1 for ADP ribosylation factor ARF1 and the corresponding sites in yeast proteins (Rattus sites are from crystal structure; Goldberg, 1999). (d) Conserved iron ligand binding site in the diiron center of RNR proteins. (e) Key residues reported to be important for UDP-glucose pyrophosphorylase activity (Geisler et al.). Differing sites in fast-evolving genes are shown in bold and underlined (a–b and d–e).
Fig. 2. Sulfate-binding site. (a) The sulfate-binding site is shown for the mouse GSK-3β (1gng). Dotted green lines show hydrogen bonding to sulfate. (b) The residues corresponding to the GSK3 sulfate-binding site in YGK3 (see Fig. 1a) were introduced into the 1gng structure in Swiss-PdbViewer (1gng numbering). The side chains of Gln205 (Gln197 in YGK3) and Asn213 (Asn205 in YGK3) were rotated to some degree to form hydrogen bonds to the sulfate oxygens. Phosphorylated tyrosine (Tyr216 in GSK-3β) is also shown. Pictures were created using Swiss-PdbViewer.
Fig. 3. Example of a highly conserved active site in a highly diverged protein. CTL1 is an extremely truncated version of yeast RNA triphosphatase (CET1), which displays only 21% identity in the remaining region. Out of 15 catalytically important residues (shown in bold), only one, a histidine (bold and underlined), is different in CTL1, indicating strong purifying selection at these positions (Lehman et al.). Sites important for dimerization in CET1 are shown underlined (Lehman et al.). The binding site for the CEG1 protein (WAQKW) in CET1 is shown in italics and underlined (Ho et al.).