Vassiliki Koufopanou, Jonathan Swire, Susan Lomas, Austin Burt.
Abstract
The Saccharomycetales or 'true yeasts' consist of more than 800 described species, including many of scientific, medical and commercial importance. Considerable progress has been made in determining the phylogenetic relationships of these species, largely based on rDNA sequences, but many nodes for early-diverging lineages cannot be resolved with rDNA alone. rDNA is also not ideal for delineating recently diverged species. From published full-genome sequence data, we have identified 14 regions of protein-coding genes that can be PCR-amplified in a large proportion of a diverse collection of 25 yeast species using degenerate primers. Phylogenetic analysis of the sequences thus obtained reveals a well-resolved phylogeny of the Saccharomycetales with many branches having high bootstrap support. Analysis of published sequences from the Saccharomyces paradoxus species complex shows that these protein-coding gene fragments are also informative about genealogical relationships amongst closely related strains. Our set of protein-coding gene fragments is therefore suitable for analysing both ancient and recent evolutionary relationships amongst yeasts.
Keywords: PCR primers; Saccharomycetales; phylogenetics
Year: 2013 PMID: 23786589 PMCID: PMC3906836 DOI: 10.1111/1567-1364.12059
Source DB: PubMed Journal: FEMS Yeast Res ISSN: 1567-1356 Impact factor: 2.796
Species used for PCRs and sequencing
| Clade | Code | CBS ID | Species | No. of gene fragments sequenced | SA | GI | PG | DE | ME | AT | GC | FS | PA | VM | EC | OL | LT1 | LT2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | C31 | 6740T | | 9 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
| | C38 | 7251T | | 10 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 |
| | C17 | 2514T | | 7 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |
| B | C30 | 6739T | | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 |
| | C04 | 521.75T | | 8 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 |
| | C19 | 2594T | | 10 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 |
| | C10 | 765.70T | | 12 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | C09 | 749.85T | | 7 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |
| | C12 | 817.71T | | 7 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 |
| | C01 | 179.60T | | 11 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| D | C40 | 8139T | | 4 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| | C37 | 7119T | | 10 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | C32 | 6929T | | 13 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 |
| | C39 | 8071T | | 9 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
| | C20 | 4111T | | 13 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| E | C21 | 4140T | | 14 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | C15 | 2286T | | 12 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
| F | C18 | 2555T | | 10 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 |
| G | C34 | 6986T | | 13 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | C36 | 7111T | | 11 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| | C23 | 5456T | | 13 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
| | C33 | 6940T | | 11 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 |
| | C35 | 7023T | | 11 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 |
| | C02 | 254T | | 11 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
| H | C03 | 398T | | 10 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
T – type strain.
Gene codes as in Table 2. 0 – not sequenced; 1 – sequenced. GenBank accession numbers KF042614-KF042824 and KF111756-KF111799.
Genes used for phylogenetic analyses
| Code | Gene | Description | Number of species | Length of alignment (amino acids) | No. of parsimony-informative characters | Min. possible no. of changes |
|---|---|---|---|---|---|---|
| SA | | Homocysteine hydrolase | 38 | 279 | 84 | 271 |
| GI | | Phosphoglucose isomerase | 39 | 182 | 85 | 306 |
| PG | | Phosphoglycerate kinase | 36 | 268 | 97 | 391 |
| DE | | Formylglycinamidine ribonucleotide synthetase | 31 | 158 | 100 | 362 |
| ME | | Methionine synthase | 36 | 160 | 65 | 208 |
| AT | | ATP synthase | 40 | 223 | 45 | 160 |
| GC | | Translation initiation factor | 30 | 219 | 79 | 264 |
| FS | | Fatty acid synthetase | 36 | 121 | 61 | 230 |
| PA | | Pyruvate dehydrogenase | 35 | 125 | 51 | 159 |
| VM | | ATPase | 38 | 180 | 28 | 67 |
| EC | | Sulphite reductase | 37 | 197 | 117 | 484 |
| OL | | DNA polymerase | 36 | 150 | 83 | 327 |
| LT1 | | Glutamate synthase | 42 | 149 | 89 | 417 |
| LT2 | | Glutamate synthase | 32 | 176 | 103 | 369 |
| 18S | 18S | 18S rRNA gene | 30 | 1993 | 336 | 906 |
Number of species: includes published genomic data from 18 species (17 for PGK1 and GLT1-2).
18S rRNA gene: sequences from GenBank; length of alignment measured in nucleotides, not amino acids.
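One way to compare the gene fragments is the fraction of aligned positions that are parsimony-informative. The sketch below computes that ratio from the alignment lengths and character counts in the table above. It is an illustrative calculation, not one reported by the authors; the names `FRAGMENTS` and `informative_fraction` are assumptions, and the 18S row is left out because its length is in nucleotides rather than amino acids.

```python
# Gene code -> (alignment length in amino acids, parsimony-informative characters),
# transcribed from the table of genes used for phylogenetic analyses.
FRAGMENTS = {
    "SA": (279, 84), "GI": (182, 85), "PG": (268, 97), "DE": (158, 100),
    "ME": (160, 65), "AT": (223, 45), "GC": (219, 79), "FS": (121, 61),
    "PA": (125, 51), "VM": (180, 28), "EC": (197, 117), "OL": (150, 83),
    "LT1": (149, 89), "LT2": (176, 103),
}


def informative_fraction(fragments):
    """Return (gene, fraction) pairs sorted from most to least informative."""
    ratios = {gene: informative / length
              for gene, (length, informative) in fragments.items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for gene, frac in informative_fraction(FRAGMENTS):
        print(f"{gene}: {frac:.1%} of aligned positions are parsimony-informative")
```

This ratio ignores differences in taxon sampling between genes (the "Number of species" column), so it is only a rough guide to relative information content.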
Primers used to amplify the 14 gene fragments
| Name | Sequence | AAT (°C) |
|---|---|---|
| SA-1 | G CAC ATG ACC AT | 60 |
| SA-4 | CC GGT | |
| GI-1 | GAT TTG GG | 55 |
| GI-2 | TG TTG | |
| PG-1 | TC AG | 55 |
| PG-2 | GAA | |
| DE-1 | CAC GAT GT | 60 |
| DE-2 | GC AAC | |
| ME-1 | C GAT ATG GTT CA | 50 |
| ME-2 | AT GGA AA | |
| AT-1 | GCT ATG GA | 55 |
| AT-2 | AC GGC AGA | |
| GC-1 | ATC GG | – |
| GC-2 | GA ACC ACC | |
| FS-1 | CAA GA | 50 |
| FS-2 | TC ACC | |
| PA-1 | GGT AAG GGT GG | – |
| PA-2 | AC AGA CAT | |
| VM-1 | AAG G | 50 |
| VM-2 | GT CAT | |
| EC-1 | G | – |
| EC-2 | GG | |
| OL-1 | GAT GT | 50 |
| OL-2 | GC CAT TTC | |
| LT-1 | ATG GA | 55 |
| LT-4 | GG | |
| LT-5 | GCA CC | 50 |
| LT-6 | CA ATC | |
Odd numbers indicate forward primers; even numbers indicate reverse.
I: inosine; otherwise IUPAC codes: R: A/G; W: A/T; M: C/A; K: T/G; Y: T/C; S: C/G.
AAT: alternative annealing temperature. All primers were first tested at 57 °C; most were then tested at a second temperature, indicated here.
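The IUPAC codes in the footnote determine how many distinct oligonucleotides a degenerate primer represents. The following sketch computes that degeneracy and enumerates the variants. It rests on two assumptions that are not from the source: inosine (I) is treated as matching any of the four bases, and the example primer `GAY ATG GTI CA` is hypothetical rather than one of the published sequences.

```python
from itertools import product

# IUPAC codes listed in the primer table footnote, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "W": "AT", "M": "AC", "K": "GT", "Y": "CT", "S": "CG",
    # Assumption: inosine is modelled as pairing with any base.
    "I": "ACGT",
}


def _bases(primer):
    """Strip the codon-style spacing and normalise case."""
    return primer.replace(" ", "").upper()


def degeneracy(primer):
    """Number of distinct sequences the degenerate primer can stand for."""
    n = 1
    for base in _bases(primer):
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every non-degenerate sequence matched by the primer."""
    pools = [IUPAC[base] for base in _bases(primer)]
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    example = "GAY ATG GTI CA"  # hypothetical primer, for illustration only
    print(degeneracy(example))
    for seq in expand(example):
        print(seq)
```

In practice inosine is a single synthesised base rather than a mixture, so counting it as four variants overstates the number of physical oligonucleotides; the expansion is mainly useful for checking which target sequences a primer could match.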
Figure 1. Phylogeny of 38 species of Saccharomycetales yeasts, plus five outgroups. Numbers above and below branches are bootstrap support statistics from likelihood and parsimony analyses, respectively. Branch lengths are proportional to inferred rates of evolutionary change. Thick lines in the phylogeny and lines and letters on the right indicate the main clades discussed in the text; arrows point to nodes showing the relationships amongst those clades.
Bootstrap support (when > 50%) for clades A-H in individual gene analyses
| Clade | ||||||||
|---|---|---|---|---|---|---|---|---|
| Gene | A | B | C | D | E | F | G | H |
| 67 | na | 79 | 69 | |||||
| 59 | 96 | na | ||||||
| na | ||||||||
| 62 | na | 77 | ||||||
| 77 | na | 87 | 57 | |||||
| na | ||||||||
| 98 | na | 95 | ||||||
| 91 | na | 97 | ||||||
| 89 | 74 | na | 96 | |||||
| na | ||||||||
| 89 | 73 | na | 97 | 80 | ||||
| 86 | na | 99 | 65 | |||||
| 94 | 68 | 65 | na | 80 | 93 | |||
| 97 | 56 | 60 | na | 88 |
Parsimony bootstrap support for each of clades A-H in separate analyses of each gene. In many cases, sequence data were not available for all members of a clade; support was counted if the bootstrap analysis supported a clade containing all the species for which there were data. na: not applicable; clade F consists of a single species in our data set and so is found by definition in all individual trees.
Number of nucleotide sites in each of the gene fragments that are variable amongst strains of the Saccharomyces paradoxus species complex
| Gene | No. of strains | No. of sites | No. of variable sites |
|---|---|---|---|
| | 16 | 845 | 21 |
| | 14 | 580 | 9 |
| | 16 | 634 | 5 |
| | 26 | 488 | 12 |
| | 12 | 482 | 7 |
| | 14 | 621 | 14 |
| | 19 | 665 | 16 |
| | 20 | 428 | 8 |
| | 19 | 451 | 23 |
| | 19 | 555 | 21 |
| | 17 | 664 | 12 |
| | 19 | 614 | 18 |
| | 14 | 557 | 21 |
| | 17 | 544 | 21 |
For PGI1, there were no sequences available from the Far East Asia lineage; for all other genes, there were sequences available from at least one strain in each of the three S. paradoxus lineages.