Mario A Fares, Beatriz Sabater-Muñoz, Christina Toft.
Abstract
Gene duplication generates new genetic material, which has been shown to lead to major innovations in unicellular and multicellular organisms. A whole-genome duplication occurred in the ancestor of Saccharomyces yeast species, but 92% of duplicates returned to single-copy genes shortly after duplication. The persisting duplicated genes in Saccharomyces led to the origin of major metabolic innovations, which have been the source of the unique biotechnological capabilities of the baker's yeast Saccharomyces cerevisiae. What factors have determined the fate of duplicated genes remains unknown. Here, we report the first demonstration that the local genome mutation and transcription rates determine the fate of duplicates. We show, for the first time, a preferential location of duplicated genes in the mutational and transcriptional hotspots of the S. cerevisiae genome. The mechanism of duplication matters, with whole-genome duplicates exhibiting different preservation trends compared to small-scale duplicates. Genome mutational and transcriptional hotspots are rich in duplicates with large repetitive promoter elements. Saccharomyces cerevisiae shows more tolerance to deleterious mutations in duplicates with repetitive promoter elements, which in turn exhibit higher transcriptional plasticity against environmental perturbations. Our data demonstrate that the genome traps duplicates through the accelerated regulatory and functional divergence of their gene copies, providing a source of novel adaptations in yeast.
Keywords: adaptations; environmental stress; expression genome hotspots; gene duplication; genetic redundancy; mutational genome hotspots; phenotypic plasticity
Year: 2017 PMID: 28459980 PMCID: PMC5433386 DOI: 10.1093/gbe/evx085
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Figure. Duplicated genes persist in genome evolutionary hotspots. (a) We estimated the mean number of nonsynonymous nucleotide substitutions per nonsynonymous site (dN; black column) and of synonymous substitutions per synonymous site (dS; white column) for the three singleton genes (locus-tag genes in the white rectangles) immediately flanking a duplicated gene (black rectangle) on either side. (b) The mean dN for each genome region containing a duplicate (GRD) within each chromosome was calculated and compared to that of genome regions containing only singletons (GRSs). We identified seven chromosomes in which GRDs exhibited significantly higher dN than GRSs (red-labeled Roman numerals on the x axis). (c) For each of the duplicates contained in each GRD of each chromosome, we searched for its paralog elsewhere in the genome. We then compared the dN of both groups and found that when the GRDs of a chromosome exhibited a mean dN below the mean dN for GRSs (white boxes), their paralogs exhibited the inverse pattern (gray boxes), and vice versa. Red-labeled chromosome numbers on the x axis indicate those for which evidence exists that at least one of the paralogs is in a GRD with high dN.
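The flanking-gene estimate described in panel (a) can be illustrated with a minimal sketch. It assumes genes are supplied in chromosomal order as `(name, is_duplicate, dN)` tuples and that other duplicates encountered in the flank are skipped; the function name, the tuple layout, and the example values are hypothetical and are not taken from the paper.

```python
from statistics import mean

def flanking_mean_dn(genes, index, k=3):
    """Mean dN of the k nearest singleton genes on each side of genes[index].

    genes: list of (name, is_duplicate, dN) tuples in chromosomal order.
    ASSUMPTION: duplicates found inside the flank are skipped, not counted.
    """
    def nearest_singletons(positions):
        picked = []
        for pos in positions:
            _name, is_dup, dn = genes[pos]
            if not is_dup:
                picked.append(dn)
            if len(picked) == k:
                break
        return picked

    left = nearest_singletons(range(index - 1, -1, -1))   # walk upstream
    right = nearest_singletons(range(index + 1, len(genes)))  # walk downstream
    flank = left + right
    return mean(flank) if flank else None

# Hypothetical toy chromosome: one duplicate flanked by three singletons per side.
example_genes = [
    ("g1", False, 0.02), ("g2", False, 0.04), ("g3", False, 0.06),
    ("dup", True, 0.10),
    ("g4", False, 0.03), ("g5", False, 0.05), ("g6", False, 0.04),
]
example_mean = flanking_mean_dn(example_genes, 3)
```

With these toy values the six flanking singletons average to a dN of about 0.04; the duplicate's own dN does not enter the mean.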
Genome Regions with Duplicates (GRD) Are Mutational, Transcriptional, and Interaction Hotspots Compared with Genome Regions with Singletons (GRS)
| | GRDs | GRSs | t | d.f. | P | |
|---|---|---|---|---|---|---|
| dN | 0.04 | 0.038 | 2.24 | 4398.15 | 0.024 | 3.13 × 10⁻⁵ |
| dS | 0.30 | 0.29 | 2.08 | 4776.90 | 0.037 | 0.343 |
| Expression | 3.33 | 3.17 | 3.58 | 4024.11 | 3.5 × 10⁻⁴ | 7.5 × 10⁻⁴ |
| GI | 220.04 | 211.73 | 2.35 | 3927.28 | 0.018 | 3.5 × 10⁻³ |
dN: mean number of nonsynonymous nucleotide substitutions per nonsynonymous site.
dS: mean number of synonymous nucleotide substitutions per synonymous site.
GI: mean number of genetic interactions.
Distribution of SNPs among GRDs and GRSs
| | # SNPs | Total # Regions | % Regions |
|---|---|---|---|
| GRSs | 1582 | 7208 | 21.95 |
| GRDs (WGDs) | 311 | 1192 | 26.1 |
| GRDs (SSDs) | 276 | 1219 | 22.64 |
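As a quick consistency check, the percentage column can be recomputed from the two count columns. The sketch below assumes that "% Regions" is the first count divided by the total number of regions; the dictionary and function names are hypothetical.

```python
# Counts copied from the table: (first count column, total number of regions).
snp_table = {
    "GRSs": (1582, 7208),
    "GRDs (WGDs)": (311, 1192),
    "GRDs (SSDs)": (276, 1219),
}

def percent_regions(with_snps, total):
    """Percentage of regions, rounded to two decimals (assumed definition)."""
    return round(100.0 * with_snps / total, 2)

percentages = {
    label: percent_regions(*counts) for label, counts in snp_table.items()
}
```

Under this reading the recomputed values agree with the table to rounding (26.09 versus the reported 26.1 for WGDs).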
Figure. Experimental evolution of S. cerevisiae reveals that mutational genome hotspots are traps for duplicates. We identified mutations in the coding regions and in the 600-nucleotide regions upstream of genes in S. cerevisiae, and calculated the nonsynonymous nucleotide substitutions (dN) between S. cerevisiae and its close phylogenetic relative S. paradoxus for the six genes surrounding each gene with a SNP in its promoter. (a) Genes that have fixed a SNP in the promoter region (black columns) in each of the five experimentally evolving lines (MA1–MA5) exhibit significantly higher dN (* indicates P < 0.01; ** indicates P < 0.001) than those without SNPs (white boxes). (b) Genome regions that contain duplicates (GRDs; black columns) exhibit more promoter SNPs fixed during the evolution experiment of each line (MA1 to MA5) than genome regions that do not contain duplicates (GRSs; white columns) (* indicates P < 0.01; ** indicates P < 0.001).
Figure. Duplicates in which one gene copy bears tandem repeat regions (trr) in its promoter share more functions than those without trr. (a) To determine the number and kind of functions of each gene copy of a duplicate (gene copies are presented as black and gray circles), we identified the significant genetic interactions of that gene (symbolized as colored squares) based on the functional chart of S. cerevisiae from a previous study (Costanzo et al. 2010). The number of shared functions between the two gene copies of a duplicate was calculated as twice the number of shared interactions divided by the sum of the total genetic interactions of both gene copies. (b) We plotted the length of the repeat units in the promoter of duplicated genes against the percentage of shared interactions (GI) of the gene copies of a duplicate. Linear and curvilinear trend adjustments are shown as red and blue lines, respectively. Pearson correlation (ρ) and its probability are shown.
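The shared-function measure in panel (a), twice the shared interactions over the summed interaction counts of both copies, has the same form as the Sørensen–Dice coefficient. A minimal sketch follows, assuming each gene copy's significant genetic interactions are given as a collection of partner identifiers; the function name and the example identifiers are hypothetical.

```python
def shared_function_fraction(interactions_a, interactions_b):
    """2 * |A ∩ B| / (|A| + |B|) for the interaction partners of two paralogs."""
    a, b = set(interactions_a), set(interactions_b)
    total = len(a) + len(b)
    if total == 0:
        # Neither copy has any recorded interaction; define the overlap as zero.
        return 0.0
    return 2 * len(a & b) / total

# Hypothetical partners: two shared interactions out of 3 + 4 in total.
copy_1 = {"x", "y", "z"}
copy_2 = {"y", "z", "w", "v"}
example_fraction = shared_function_fraction(copy_1, copy_2)  # 4/7
```

The value ranges from 0 (no shared interactions) to 1 (identical interaction sets); multiplying by 100 gives a percentage as plotted in panel (b).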
Figure. Duplicates exhibit high transcriptional plasticity under stress. (a) We compared the transcriptional plasticity of duplicates to that of singletons of S. cerevisiae growing under each of a set of 13 different stress conditions. RNA sequence data for eight of the conditions (acidic stress, alkaline stress, wine fermentation, heat stress, lithium stress, manganese stress, osmotic stress, and glucose limitation) were extracted from previously published data, which are available from the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/download-data/expression; last accessed May 4, 2017). RNA sequence data for five of the conditions (ethanol stress, lactic stress, glycerol stress, oxidative stress, and oxidative stress with glucose) were obtained in the laboratory after growing populations of the yeast under these conditions. We calculated the increments in the expression levels (ΔE) by comparing the normalized RNA sequence counts in normal conditions (i) versus those for stress conditions (j). Increments of expression were calculated as . We considered genes with significant increments in expression and with at least 20% increments from the normal to the stress conditions. The proportions of genes with significant increments for duplicates (D) and singletons (S) were estimated (color coded in the figure) and compared using a Fisher’s exact test (probability plot). (b) For each of the 13 stress conditions, we calculated the percentage of cases in which the duplicated gene copy with repeats in its promoter region (Drep) exhibited a larger increment in expression (ΔEi−j) than its paralog with no repeats in its promoter (D) (black portions of the columns), and vice versa (gray portions of the columns).
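The comparison in panel (a) can be sketched in two steps: flag genes whose expression increases by at least 20%, then compare the flagged proportions of duplicates and singletons with a two-sided Fisher's exact test. The formula for ΔE is not preserved in this record, so the relative change used below is an assumption, as are all function names; the per-gene significance filter mentioned in the caption is not modeled.

```python
from math import comb

def relative_increment(normal, stress):
    # ASSUMPTION: relative change from normal (i) to stress (j);
    # the source's own formula for ΔE is not available here.
    return (stress - normal) / normal

def count_plastic(expression_pairs, threshold=0.20):
    """Return (genes at or above the threshold, genes below it)."""
    hits = sum(
        1 for normal, stress in expression_pairs
        if relative_increment(normal, stress) >= threshold
    )
    return hits, len(expression_pairs) - hits

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
```

In use, `count_plastic` would be applied separately to the duplicate and singleton gene sets for one stress condition, and the two resulting pairs passed as the rows of the 2x2 table.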