Polina B Drozdova1,2, Oleg V Tarasov1,3, Andrew G Matveenko4,1,5, Elina A Radchenko1,2, Julia V Sopova4,6,1, Dmitrii E Polev7, Sergey G Inge-Vechtomov4,1, Pavel V Dobrynin8,2.
Abstract
The Peterhof genetic collection of Saccharomyces cerevisiae strains (PGC) is a large laboratory stock that has accumulated several thousand strains over more than half a century. It originated independently of other common laboratory stocks from a distillery lineage (race XII). Several PGC strains have been extensively used in certain fields of yeast research, but their genomes have not yet been thoroughly explored. Here we employed whole genome sequencing to characterize five selected PGC strains, including one of the closest to the progenitor, 15V-P4, and several strains that have been used to study translation termination and prions in yeast (25-25-2V-P3982, 1B-D1606, 74-D694, and 6P-33G-D373). The genetic distance between the PGC progenitor and S288C is comparable to that between two geographically isolated populations. The PGC seems to be closer to two bakery strains than to S288C-related laboratory stocks or European wine strains. In the genomes of the PGC strains, we found several loci that are absent from the S288C genome; 15V-P4 harbors a rare combination of the gene cluster characteristic of wine strains and the RTM1 cluster. We closely examined known and previously uncharacterized gene variants of particular strains and were able to establish the molecular basis for known phenotypes, including phenylalanine auxotrophy, clumping behavior and galactose utilization. Finally, we made the sequencing data and the results of the analysis available to the yeast community. Our data broaden the knowledge of genetic variation between Saccharomyces cerevisiae strains and can form the basis for planning future work with PGC-related strains and PGC-derived alleles.
Year: 2016 PMID: 27152522 PMCID: PMC4859572 DOI: 10.1371/journal.pone.0154722
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
De novo assembly statistics.
| Strain | 15V-P4 | 25-25 | 1B | 74 | 6P-33G |
|---|---|---|---|---|---|
| Number of contigs | 1,165 | 891 | 480 | 1,514 | 3,039 |
| Largest contig, bases | 86,918 | 122,204 | 252,839 | 71,636 | 22,054 |
| Total length, bases | 11,666,974 | 11,614,012 | 11,567,449 | 11,330,585 | 10,013,534 |
| N50 | 19,288 | 26,385 | 72,884 | 11,948 | 4,341 |
| S288C genome fraction, % | 91.87 | 92.78 | 94.17 | 92.53 | 76.01 |
| Median S288C genome coverage | 19x | 22x | 35x | 42x | 16x |
| Number of S288C genes found (%) | 5,115 (87%) | 5,000 (85%) | 4,993 (84%) | 5,058 (86%) | 4,004 (68%) |
| Complete CEGMA core genes found (%) | 241 (97%) | 240 (97%) | 243 (98%) | 243 (98%) | 210 (85%) |
| Partial CEGMA core genes found (%) | 244 (98%) | 241 (97%) | 243 (98%) | 243 (98%) | 226 (91%) |
* Calculated with Quast.
** Calculated as one-to-one orthologs with ProteinOrtho clustering.
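The N50 row in the table above summarizes assembly contiguity. As a minimal sketch of the standard definition (the length of the contig at which the running sum of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length), the calculation could look as follows. The function name and the toy contig lengths are hypothetical and are not taken from the study; the reported values come from the authors' own pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bases)."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop at the first contig where the running sum covers half the assembly.
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must contain at least one contig")


# Hypothetical toy assembly, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A higher N50 at a similar total length indicates a less fragmented assembly, which is consistent with the 1B assembly having both the fewest contigs and the largest N50 in the table.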
Fig 1. 15V-P4 position in the phylogeny of S. cerevisiae strains.
(A) Neighbor-joining phylogenetic tree of 95 strains including 15V-P4 inferred from an alignment of conserved chromosome regions. (B) Phylogenetic tree of 29 strains including 15V-P4 inferred from sequences of 807 common genes under the GTR+G model and tested with 500 bootstrap replicates. Branch bootstrap values greater than 95 are indicated. In both trees, strain names are colored according to functional origin. Grey circles highlight either the population group (A) or common functional origin (B). Branch lengths are given on the same scale in both trees. PGC, the Peterhof genetic collection.
Genes absent from the S288C genome but found in 15V-P4.
| Gene name | Protein function | S288C | 15V-P4 |
|---|---|---|---|
| | Amidase | No | Yes |
| | Killer toxin | No | Yes |
| | Lipid exporter | No | Yes |
| | Hypothetical zinc finger transcription factor | No | Yes |
| | Invertase | ≥ 2 | |
| Wine12 | 5-oxo-L-prolinase | No | Pseudogene? |
| Wine23 | Nicotinic acid permease | No | Yes |
| Wine34 | Flocculin | No | Yes |
| Wine45 | Transcription factor | No | Yes |
| Wine56 | Transcription factor | No | Yes |
Yes/No denotes presence/absence of the corresponding gene.
* Members of the SUC family, see S2 Table.
Fig 2. Genome coverage across reference for representative strains.
(A) 15V-P4, (B) 25-25, (C) 6P-33G. Dashed lines signify chromosome borders.
Lengths of regions annotated as amplified or deleted in each strain and counts of genes included in each of these regions.
| Strain | 15V-P4 | 25-25 | 1B | 74 | 6P-33G |
|---|---|---|---|---|---|
| Total length of amplified regions, bp | 458,407 | 856,981 | 474,369 | 613,763 | 1,054,510 |
| Total number of amplified genes | 179 | 392 | 141 | 159 | 499 |
| Total length of deleted regions, bp | 84,475 | 53,117 | 39,722 | 24,334 | 36,913 |
| Total number of deleted genes | 20 | 9 | 5 | 6 | 24 |
Fig 3. Distribution of variable sites shown in chromosomal coordinates of S288C.
Green: SNVs compared to S288C. Purple: SNVs compared to 15V-P4. Each chromosome is framed.
Selectable marker mutations in the PGC strains.
| Allele | Known change | Variations detected in this study | Strains |
|---|---|---|---|
| | G732A (→TGA) [ | G732A (→TGA) | 25-25, 74, 1B |
| | Nonsense ( | T422A + G423A (→TAA), C1517T (→P506L) | 6P-33G |
| | Deletion (-205 to +172) [ | Deletion (-205 to +172) | 74 |
| | A229T (→TAA) [ | A229T (→TAA) | 25-25, 6P-33G, 1B |
| | Unidentified non-suppressible mutation | G748A (→D250N) | 25-25 |
| | 249insG, 792insG [ | 249insG, 792insG | 74, 1B |
| | Nonsense mutation, TGA [ | G3465A (→TGA) | 25-25 |
| | T605A (→TAA) [ | T605A (→TAA) | 6P-33G, 1B |
| | Nonsense mutation [ | A481T (→TAA) | 6P-33G |
| | Nonsense mutation, TGA [ | A1180T (→TGA) | 25-25 |
| | C403T (→TAG) [ | C403T (→TAG) | 6P-33G, 74, 1B |
| | Complete deletion of | Deletion (-188 to +76) | 25-25 |
| | Ty1 insertion (transcribing left to right) at 121 [ | Ty insertion at 121 | 6P-33G, 74, 1B |
Nucleotide positions in the 5’ UTR are preceded by a minus sign and those in the 3’ UTR by a plus sign; numbers indicate the distance from the beginning or the end of the ORF, respectively. The stop codon type or amino acid substitution is indicated after an arrow for mutations that must lead to known auxotrophic phenotypes.
* Only differences from the corresponding wild type alleles are listed. For complete list of substitutions, see S1 Appendix.
** Synonymous designations.
*** Assigned to the PHA2 locus in this work.
**** Includes duplication of insertion flanking sequence (GTACC).
Fig 4. Only PHA2P but not pha2 mutant alleles compensate for the pheA10 phenylalanine auxotrophy.
33G-D373 was transformed with plasmids bearing indicated PHA2 alleles. Series of 5-fold dilutions on synthetic media are shown. Vector, pRS316.
Fig 5. Cell aggregation phenotypes of the strains analyzed correlate with AMN1 and FLO8 alleles.
The scale bar indicates 10 μm. Amn1 and Flo8 variants are shown in color (green: associated with “clumping” phenotypes; red and purple: “non-clumping”). Representative microphotographs out of five fields of view of yeast liquid medium cultures in early stationary phase are shown.
Yeast strains used in this work.
| Name | Known genotype | References |
|---|---|---|
| 15V-P4 | | [ |
| 25-25-2V-P3982 | | [ |
| 6P-33G-D373 Asp+ (6P-33G) | | [ |
| 6P-33G-D373 Asp- | | [ |
| 33G-D373 | | [ |
| 74-D694 | | [ |
| P-74-D694 | | K. Volkov, unpublished; [ |
| 1B-D1606 | | [ |
| S1 (isogenic to S288C) | | [ |
* The full name of the strain is 25-25-dU8-132-L28-2V-P3982.
** PmSUP35, Pichia methanolica SUP35.
*** A [PSI+] derivative of 74-D694. In addition, other [PSI+] derivatives of 74-D694 (OT55 and OT56) [61, 88] and their [psi-] isolates were used for the clumping experiments.
**** gal10-1B designates the gal10 allele from 1B-D1606 discovered in this work.