Raúl A Ortiz-Merino, Javier A Varela, Aisling Y Coughlan, Hisashi Hoshida, Wendel B da Silveira, Caroline Wilde, Niels G A Kuijpers, Jan-Maarten Geertman, Kenneth H Wolfe, John P Morrissey.
Abstract
Kluyveromyces marxianus is traditionally associated with fermented dairy products, but can also be isolated from diverse non-dairy environments. Because of thermotolerance, rapid growth and other traits, many different strains are being developed for food and industrial applications but there is, as yet, little understanding of the genetic diversity or population genetics of this species. K. marxianus shows a high level of phenotypic variation but the only phenotype that has been clearly linked to a genetic polymorphism is lactose utilisation, which is controlled by variation in the LAC12 gene. The genomes of several strains have been sequenced in recent years and, in this study, we sequenced a further nine strains from different origins. Analysis of the Single Nucleotide Polymorphisms (SNPs) in 14 strains was carried out to examine genome structure and genetic diversity. SNP diversity in K. marxianus is relatively high, with up to 3% DNA sequence divergence between alleles. It was found that the isolates include haploid, diploid, and triploid strains, as shown by both SNP analysis and flow cytometry. Diploids and triploids contain long genomic tracts showing loss of heterozygosity (LOH). All six isolates from dairy environments were diploid or triploid, whereas 6 out of 7 isolates from non-dairy environments were haploid. This also correlated with the presence of functional LAC12 alleles only in dairy haplotypes. The diploids were hybrids between a non-dairy and a dairy haplotype, whereas triploids included three copies of a dairy haplotype.
Keywords: Kluyveromyces; LAC12; dairy; industrial yeast; lactose transport; non-conventional yeast; yeast evolution
Year: 2018 PMID: 29619042 PMCID: PMC5871668 DOI: 10.3389/fgene.2018.00094
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Sources of K. marxianus strains and genomes analysed in this study.
| Strain | Other designations | Country of origin | Source | Strain obtained from | Genome sequence reference | Sequencing institution |
| --- | --- | --- | --- | --- | --- | --- |
| L01 | | Unknown | Dairy | Lallemand Inc. | This study | University College Dublin (K.H. Wolfe) |
| L02 | | Unknown | Dairy | Lallemand Inc. | This study | University College Dublin (K.H. Wolfe) |
| L03 | | Unknown | Dairy | Lallemand Inc. | This study | University College Dublin (K.H. Wolfe) |
| L04 | | Unknown | Baking | Lallemand Inc. | This study | University College Dublin (K.H. Wolfe) |
| L05 | | Unknown | Distillery | Lallemand Inc. | This study | University College Dublin (K.H. Wolfe) |
| CBS397 | | Netherlands | Yoghurt | Westerdijk Institute, Netherlands | This study | University College Cork (J.P. Morrissey) |
| NBRC0272 | | Unknown | Miso | Biological Resource Center, NITE (NBRC), Japan | This study | Yamaguchi University (H. Hoshida) |
| NBRC0288 | DSM4906 | Unknown | Unknown | Biological Resource Center, NITE (NBRC), Japan | This study | Yamaguchi University (H. Hoshida) |
| NBRC0617 | ATCC8622 | Denmark | Yoghurt | Biological Resource Center, NITE (NBRC), Japan | This study | Yamaguchi University (H. Hoshida) |
| NBRC1777 | | Japan | Soil | Biological Resource Center, NITE (NBRC), Japan | Inokuma et al., | Yamaguchi University (H. Hoshida) |
| CBS6556 | KCTC17555, ATCC26548 | Mexico | Pozol | Westerdijk Institute, Netherlands | Jeong et al., | Yonsei University (J. F. Kim) |
| UFV-3 | CCT7735 | Brazil | Dairy | Universidade Federal de Viçosa, Brazil | Silveira et al., | BIOAGRO, Brazil (F. M. L. Passos) |
| DMKU3-1042 | | Thailand | Soil | Strain not obtained | Lertwattanasakul et al., | Yamaguchi University (H. Hoshida) |
| UFS-Y2791 | | South Africa | | Strain not obtained | Schabort et al., | Univ. of the Free State (D. T. W. P. Schabort) |
The reference genome sequence of NBRC1777 (Inokuma et al., .
Genome sequence obtained from a ura3 derivative generated by UV mutagenesis.
Summary of Illumina sequencing strategies and coverage for 14 strains used in SNP analysis.
| Strain | Coverage (×) | Read type | Read length (bp) | Reads (millions) | BWA algorithm |
| --- | --- | --- | --- | --- | --- |
| L01 | 47 | SE | 50 | 16.6 | aln samse |
| L02 | 47 | SE | 50 | 16.7 | aln samse |
| L03 | 45 | SE | 50 | 16.7 | aln samse |
| L04 | 45 | SE | 50 | 15.6 | aln samse |
| L05 | 47 | SE | 50 | 16.2 | aln samse |
| CBS397 | 141 | PE | 126 | 15.1 | mem |
| NBRC0272 | 160 | PE | 100 | 20.4 | mem |
| NBRC0288 | 212 | PE | 100 | 25.6 | mem |
| NBRC0617 | 35 | SE | 50 | 11.6 | aln samse |
| NBRC1777 | 110 | PE | 100 | 12.8 | mem |
| CBS6556 | 623 | PE | 70-150 | 56.9 | mem |
| UFV-3 | 359 | PE | 90 | 50.3 | mem |
| DMKU3-1042 | 341 | PE | 100 | 44.6 | mem |
| UFS-Y2791 | 50 | PE | 75-100 | 17.1 | mem |
SE, single-end; PE, paired-end.
SNPs identified in 14 K. marxianus strains.
| Strain | Heterozygous SNPs | Homozygous SNPs | Total SNPs |
| --- | --- | --- | --- |
| NBRC1777 | 248 | 1 | 249 |
| L05 | 611 | 33,326 | 33,937 |
| L04 | 726 | 35,138 | 35,864 |
| CBS6556 | 1,439 | 40,922 | 42,361 |
| DMKU3-1042 | 1,647 | 39,648 | 41,295 |
| UFS-Y2791 | 714 | 325,190 | 325,904 |
| L02 | 115,648 | 42,561 | 158,209 |
| L01 | 96,202 | 49,325 | 145,527 |
| CBS397 | 110,132 | 52,415 | 162,547 |
| NBRC0288 | 96,304 | 60,554 | 156,858 |
| NBRC0272 | 147,241 | 40,085 | 187,326 |
| NBRC0617 | 29,010 | 145,016 | 174,026 |
| L03 | 26,388 | 145,856 | 172,244 |
| UFV-3 | 27,347 | 177,365 | 204,712 |
Number of sites at which a variant (non-reference base) was present, at a frequency in the reads between 0.15 and 0.85.
Number of sites at which a variant (non-reference base) was present, at a frequency ≥ 0.85 in the reads.
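The two frequency thresholds in these footnotes amount to a simple per-site rule. The following is a minimal sketch of that rule, not the authors' pipeline: the function names and the `not_called` label are hypothetical, and treating a frequency of exactly 0.15 as not called is an assumption based on the Figure 1 caption.

```python
from collections import Counter

def classify_site(alt_freq, low=0.15, high=0.85):
    """Label a variant site by its alternative-allele frequency in the reads."""
    if alt_freq >= high:
        return "homozygous"      # variant present at frequency >= 0.85
    if alt_freq > low:
        return "heterozygous"    # variant frequency between 0.15 and 0.85
    return "not_called"          # assumption: sites at or below 0.15 are ignored

def count_sites(alt_freqs):
    """Tally site classes for one strain (hypothetical helper)."""
    return Counter(classify_site(f) for f in alt_freqs)
```

Applied to every variant site in a strain, `count_sites` would give the kind of per-strain heterozygous and homozygous counts reported in the table.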
Figure 1. Variable ploidy in Kluyveromyces marxianus strains. Strain names are shown on the left. (A) Histograms of the alternative allele frequencies of variant (non-reference) bases, for SNPs designated as heterozygous (sites with alternative allele frequencies fB between 0.15 and 0.85). Histograms are coloured grey if at least 10% of the SNPs in a strain are heterozygous. Dashed vertical lines mark frequencies of 0.5 (purple), 0.33/0.66 (blue), and 0.25/0.75 (green). Bin sizes are 2% intervals. (B) Plots of alternative allele frequencies along the 8 chromosomes, for each strain. Horizontal dashed lines mark frequencies as in (A). Light and dark grey points indicate SNPs on different chromosomes. Red triangles mark the locations of centromeres, and the blue triangle marks the ribosomal DNA locus. Allele frequencies ≥ 0.85 are shown as 1. Alternative allele frequencies ≤ 0.15 are not shown. (C) Flow cytometry of DNA content. The Y-axis shows numbers of cells, and the X-axis shows SYTOX Green fluorescence signal intensity (arbitrary units) which is proportional to DNA content. Flow cytometry was not carried out for UFS-Y2791 and DMKU3-1042.
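Panel (A) can be read as a peak-matching exercise: heterozygous sites in a diploid are expected to cluster near 0.5, and in a triploid near 1/3 and 2/3. The sketch below illustrates that reading with a deliberately crude heuristic; it is an assumption for illustration, not the procedure used in the study, and the helper names are hypothetical.

```python
from collections import Counter

# Reference frequencies marked by the dashed lines in panel (A).
EXPECTED_PEAKS = {
    "0.5": (0.5,),
    "0.33/0.66": (1 / 3, 2 / 3),
    "0.25/0.75": (0.25, 0.75),
}

def modal_bin_centre(alt_freqs, n_bins=50):
    """Centre of the most populated histogram bin (50 bins = 2% intervals)."""
    bins = Counter(min(int(f * n_bins), n_bins - 1) for f in alt_freqs)
    top_bin, _ = bins.most_common(1)[0]
    return (top_bin + 0.5) / n_bins

def nearest_expected_peak(centre):
    """Label of the reference line set closest to an observed peak."""
    return min(
        EXPECTED_PEAKS,
        key=lambda label: min(abs(centre - p) for p in EXPECTED_PEAKS[label]),
    )
```

A single modal bin ignores the overall shape of the distribution, so a result from this sketch would need to be checked against the full histogram and against an independent measurement such as the flow cytometry in panel (C).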
Figure 2. Plots of sequence coverage in each strain. The Y-axis is log2 of the ratio between the observed and expected coverage, for 10-kb windows through the genome; a value of zero (dashed blue line) indicates no difference. Expected coverage is based on the average in the whole genome. Red lines show the segmental means for consecutive 10-kb windows calculated using the Bioconductor package DNAcopy. Cyan lines for NBRC0617 indicate the value expected for the 1.33-fold increase in coverage that would result from a fourth copy of a region in a triploid.
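The quantity on the Y-axis and the position of the cyan lines follow from simple arithmetic, sketched below. This is a minimal illustration assuming per-window coverage values are already available as a list; the function names are hypothetical, and the DNAcopy segmentation step is not reproduced.

```python
import math

def log2_coverage_ratios(window_coverage):
    """log2(observed / expected) per window, with the genome-wide mean as expected."""
    expected = sum(window_coverage) / len(window_coverage)
    return [
        math.log2(c / expected) if c > 0 else float("-inf")  # guard zero-coverage windows
        for c in window_coverage
    ]

def expected_log2_ratio(copies, ploidy):
    """log2 ratio expected for a region present in `copies` copies at a given ploidy."""
    return math.log2(copies / ploidy)

# A fourth copy of a region in a triploid: 4/3 = 1.33-fold, log2 of about 0.415.
fourth_copy_in_triploid = expected_log2_ratio(4, 3)
```

Windows whose ratio sits near `fourth_copy_in_triploid` rather than near zero would be consistent with a region carrying one extra copy in a triploid background.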
Figure 3. Haplotype assignment in diploid strains. For each 1-kb window through the genome, dots show the number of variable sites in the window with genotypes AA (black dots), BB (red dots), or AB (blue dots). Each window was then classified as either heterozygous (yellow background), homozygous for the B haplotype (blue background), or homozygous for the reference A haplotype (grey background) following rules as described in Methods and Figure S1. The green bars along the bottom axis show the regions (18%) that were heterozygous in all diploids and used for the haplotype phylogeny analysis.
Figure 4. Phylogenetic tree of K. marxianus haplotypes. Coloured dots indicate the ploidy of strains as haploid (cyan; 1 haplotype per strain), diploid (magenta; 2 haplotypes per strain), or triploid (yellow; 3 haplotypes per strain). The environmental source of each strain is shown (for diploid strains, the source is only labelled on the “B” haplotype). Orange backgrounds indicate Lac+ strains, grey backgrounds indicate Lac− strains, and white backgrounds indicate Lac phenotypes not tested. The tree was constructed from SNP data from regions that are heterozygous in diploids, totalling 18% of the genome. Inset, phylogenetic tree constructed from DNA sequences of GCN1 (an arbitrarily chosen large gene), confirming that UFS-Y2791 is a strain of K. marxianus.
Figure 5. Growth and lactose consumption by K. marxianus strains. Growth in MM + 2% lactose and lactose consumption are shown in black and white bars, respectively. Lactose consumption was calculated by subtracting the final lactose concentration in the medium from the initial concentration.