Robert Kourist1, Felix Bracharz2, Jan Lorenzen2, Octavia N Kracht3, Mansi Chovatia4, Chris Daum4, Shweta Deshpande4, Anna Lipzen4, Matt Nolan4, Robin A Ohm5, Igor V Grigoriev4, Sheng Sun6, Joseph Heitman6, Thomas Brück7, Minou Nowrousian8.
1. Junior Research Group for Microbial Biotechnology, Ruhr-Universität Bochum, Bochum, Germany.
2. Fachgebiet Industrielle Biokatalyse, Technische Universität München, Garching, Germany.
3. Junior Research Group for Microbial Biotechnology, Ruhr-Universität Bochum, Bochum, Germany.
4. U.S. Department of Energy Joint Genome Institute, Walnut Creek, California, USA.
5. U.S. Department of Energy Joint Genome Institute, Walnut Creek, California, USA; Department of Microbiology, Utrecht University, Utrecht, The Netherlands.
6. Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina, USA.
7. Fachgebiet Industrielle Biokatalyse, Technische Universität München, Garching, Germany.
8. Lehrstuhl für Allgemeine und Molekulare Botanik, Ruhr-Universität Bochum, Bochum, Germany.
Correspondence: robert.kourist@rub.de, brueck@tum.de, minou.nowrousian@rub.de.
Abstract
Microbial fermentation of agro-industrial waste holds great potential for reducing the environmental impact associated with the production of lipids for industrial purposes from plant biomass. However, the chemical complexity of many residues currently prevents efficient conversion into lipids, creating a high demand for strains with the ability to utilize all energy-rich components of agricultural residues. Here, we present results of genome and transcriptome analyses of Trichosporon oleaginosus. This oil-accumulating yeast is able to grow on a wide variety of substrates, including pentoses and N-acetylglucosamine, making it an interesting candidate for biotechnological applications. Transcriptomics shows specific changes in gene expression patterns under lipid-accumulating conditions. Furthermore, gene content and expression analyses indicate that T. oleaginosus is well-adapted for the utilization of chitin-rich biomass. We also focused on the T. oleaginosus mating type, because this species is a member of the Tremellomycetes, a group that has been intensively analyzed as a model for the evolution of sexual development, the best-studied member being Cryptococcus neoformans. The structure of the T. oleaginosus mating-type regions differs significantly from that of other Tremellomycetes and reveals a new evolutionary trajectory paradigm. Comparative analysis shows that recruitment of developmental genes to the ancestral tetrapolar mating-type loci occurred independently in the Trichosporon and Cryptococcus lineages, supporting the hypothesis of a trend toward larger mating-type regions in fungi.
IMPORTANCE: Finite fossil fuel resources pose sustainability challenges to society and industry. Microbial oils are a sustainable feedstock for biofuel and chemical production that does not compete with food production. We describe genome and transcriptome analyses of the oleaginous yeast Trichosporon oleaginosus, which can accumulate up to 70% of its dry weight as lipids. In contrast to conventional yeasts, this organism not only shows an absence of diauxic effect while fermenting hexoses and pentoses but also effectively utilizes xylose and N-acetylglucosamine, which are building blocks of lignocellulose and chitin, respectively. Transcriptome analysis revealed metabolic networks that govern conversion of xylose or N-acetylglucosamine as well as lipid accumulation. These data form the basis for a targeted strain optimization strategy. Furthermore, analysis of the mating type of T. oleaginosus supports the hypothesis of a trend toward larger mating-type regions in fungi, similar to the evolution of sex chromosomes in animals and plants.
The foreseeable end of fossil resources, associated with significant price fluctuations, drives the development of sustainable biomass-based processes in the chemical industry sector. Plant-derived oils are already essential feedstocks in the production of bio-based specialty and commodity chemical building blocks (1, 2). However, their material utilization competes with food production and has an increasing impact on land use changes (3, 4). An emerging alternative is the fermentative conversion of agro-industrial residues for production of microbe-derived lipids.
Oleaginous yeasts are promising microbial factories for sustainable lipid production, as they can accumulate between 40 and 70% of their biomass as intracellular storage triglycerides when primary nutrients are limited (5). In particular, these organisms can metabolize pentose and hexose sugars from complex biomass hydrolysates and switch on lipogenesis when metabolic stresses, such as nitrogen and phosphorus deprivation, are applied. The metabolic adaptability of oleaginous yeasts to various carbon sources is of interest for second-generation bioprocess engineering approaches (2). While the exact fatty acid profile of accumulated triglycerides depends on cultivation conditions, the principal chemical composition resembles that of plant-derived lipids, such as palm oil. Interestingly, oleaginous yeasts belong to diverse taxonomic groups, indicating that the metabolic capacity for lipid accumulation has evolved independently in basidiomycete (6) and ascomycete fungi (7–9).
At present, nitrogen limitation has been identified as the strongest inducer of lipogenesis in oleaginous organisms. Biochemical and metabolic engineering studies have identified malic enzyme (ME) (10–12), diacylglycerol acyltransferase 1 (DGA1) (13), and acetyl coenzyme A (CoA) carboxylase (13) as metabolic regulators of lipogenesis under N stress conditions for selected fungi and microalgae. However, recent systems biology studies in different oleaginous organisms could not identify a common metabolic mechanism for lipid accumulation under N-limiting conditions (7, 14). Furthermore, metabolic mechanisms governing lipogenesis under P-limiting conditions have not yet been explored. Hence, the overall metabolic networks and regulatory mechanisms that drive lipid accumulation in different oleaginous organisms remain elusive (2).
Here, we present genome and transcriptome analysis results for the oil-accumulating yeast Trichosporon oleaginosus strain IBC0246. This Trichosporon strain is distinct from conventional yeasts, as it can simultaneously metabolize both pentose and hexose sugars without any catabolic preference. Its capacity to metabolize N-acetylglucosamine (NAG) and the ability to accumulate 70% of its dry weight as lipid droplets make it an interesting candidate for biotechnological applications. Sequence identity of the internal transcribed spacer (ITS) sequences shared with two previously characterized T. oleaginosus strains (15) identified strain IBC0246 as Trichosporon oleaginosus. Trichosporon species are basidiomycetes that belong to the order Tremellales. While nonpathogenic Trichosporon species have been isolated from environmental soil and milk whey samples, some isolates have recently come to attention as pathogens of immunocompromised hosts. Trichosporon species mostly grow as yeasts, but many are able to also form pseudohyphae or true hyphae and arthroconidia (16, 17). A sexual cycle and ploidy level have not been described for Trichosporon species.
Strain IBC0246 is able to utilize a variety of carbon sources without showing a diauxic lag in growth (unpublished data), making it an interesting candidate for biotechnological applications. Interestingly, it can efficiently utilize pentoses such as xylose (XYL) and NAG, which are major monomeric sugar building blocks of lignocellulosic and chitin-rich biomass residues, respectively (18). Because XYL and NAG utilization is challenging for conventional industrial yeasts (19), Trichosporon oleaginosus is poised to be developed as a next-generation microbial production platform for the de novo biosynthesis of sustainable oleochemicals. In this study, we therefore focused on lipid accumulation and the possible utilization of the two biomass-derived sugars XYL and NAG. We show that in response to nitrogen starvation this yeast is capable of utilizing xylose as a carbon source for lipid accumulation, and we highlight several pathways of oleaginicity. Transcriptome analysis of a sample using NAG as the sole carbon source showed that NAG is efficiently channeled toward the primary metabolism.
Furthermore, we analyzed the mating-type structure of T. oleaginosus, because this species is a member of the Tremellomycetes, which in recent years have become model organisms for the study of the evolution of sexual development. A well-studied member of this group is the pathogenic yeast Cryptococcus neoformans. The structure of the T. oleaginosus mating-type regions differs significantly from that of the Cryptococcus species complex. Comparative analysis suggests that it has undergone independent evolution shaped by a similar trend toward larger mating-type regions, as found in other Tremellomycetes (20), but in which orthologous genes functioning in sexual development (MYO2, STE11, IKS1, STE20) have been differentially incorporated into either one or the other of the ancestral tetrapolar MAT loci in the two lineages. This reveals a heretofore-unknown level of plasticity in expansion of sex-determining regions in fungi, with implications for sex chromosome evolution in animals and plants.
RESULTS AND DISCUSSION
Genome sequencing of Trichosporon oleaginosus strain IBC0246 reveals great evolutionary flexibility in mating-type configurations in Tremellomycetes.
The genome of T. oleaginosus strain IBC0246 was sequenced as part of the 1000 Fungal Genomes project (http://1000.fungalgenomes.org) (21) and assembled into 180 scaffolds (Table 1). With an assembly size of 19.8 Mb, it is in a size range similar to the genomes of the pathogenic Tremellomycetes Cryptococcus neoformans and Cryptococcus gattii (18 to 19 Mb) and the opportunistic pathogen Trichosporon asahii (24 to 25 Mb) (22–26). Evidence-based annotation using transcriptome sequencing (RNA-seq) data (see below) identified about 8,300 protein-coding genes (see Table S1 in the supplemental material), which is similar to T. asahii (8,300 to 8,500 protein-coding genes) and somewhat more than the 6,500 to 6,900 predicted genes in Cryptococcus species. Seventy-three percent of all predicted proteins showed similarities to proteins from the Swissprot database, and 83% had a KEGG annotation (Table 1). The genome sequence and all annotations are available via the MycoCosm portal (http://genome.jgi.doe.gov/Triol1) (27).
Table 1
Main features of the T. oleaginosus genome

| Statistic | Value |
| --- | --- |
| Assembly statistics | |
| Assembly length (Mbp) | 19.8 |
| Contig length total (Mbp) | 19.8 |
| No. of contigs | 196 |
| Contig N50 (bp) | 34 |
| Contig L50 (bp) | 190,830 |
| No. of scaffolds | 180 |
| Scaffold N50 (bp) | 29 |
| Scaffold L50 (bp) | 216,041 |
| No. of scaffold gaps | 16 |
| Length of scaffold gaps (bp) | 4,451 |
| % of scaffolds in gaps | 0.02 |
| No. of repeat-covered regions | 4,916 |
| Length of repeat-covered regions (bp) | 564,416 |
| % of assembly covered by repeats | 2.85 |
| GC content (%) | 60.75 |
| Gene statistics | |
| No. of genes | 8,322 |
| Protein length (median no. of amino acids) | 367 |
| Exon length (median bp) | 238 |
| Gene length (median bp) | 1,500 |
| Transcript length (median bp) | 1,309 |
| Intron length (median bp) | 39 |
| No. of genes with intron | 7,010 |
| % of genes with intron | 84.23 |
| No. of introns/gapped gene (median) | 3 |
| Functional annotations | |
| No. (%) of genes with KEGG annotation | 6,869 (82.54) |
| No. (%) of genes with KOG annotation | 5,937 (71.34) |
| No. (%) of genes with Swissprot hit | 6,088 (73.16) |
| No. (%) of genes with Pfam domain | 4,833 (58.07) |
| No. (%) of genes with transmembrane domain | 1,519 (18.25) |
| No. (%) of genes with signal peptide | 1,873 (22.51) |
| No. of unique Pfam domains | 2,248 |
| Annotation completeness (CEGMA) (%) | 99.13 |

Assembly statistics are based on 1 N to denote a gap.
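The N50 and L50 rows of Table 1 summarize assembly contiguity with two numbers: a sequence length and a sequence count. As a minimal sketch (not the pipeline used for this assembly), both can be derived from a list of contig or scaffold lengths as shown here. The function name `n50_stats` is hypothetical, and because tools differ in which of the two values they label N50 and which L50, the function returns them as an unlabeled (length, count) pair.

```python
def n50_stats(lengths):
    """Return (length, count) for a list of sequence lengths.

    Sequences are sorted from longest to shortest and summed until the
    running total reaches half of the assembly size. `length` is the
    length of the last sequence added, and `count` is the number of
    sequences needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no sequence lengths given")


# Toy example with six made-up sequence lengths (total 290, half 145).
print(n50_stats([80, 70, 50, 40, 30, 20]))
```

In the toy example, the two longest sequences (80 + 70 = 150) already cover half of the total, so the pair is a length of 70 and a count of 2.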
Repeat analyses based on similarity to known repeat classes as well as de novo repeat finding showed that in T. oleaginosus repeats longer than 200 bp constitute only 0.3% of the genome. The total repeat content of the genome, consisting mostly of small repeats, is 2.85%, which is in a range similar to that of C. neoformans (5% repeats) (23).
To analyze synteny between T. oleaginosus and other Tremellomycetes, we used the PROmer algorithm from the MUMmer package (28); however, even a comparison with the closest sequenced relative, T. asahii, yielded only a few aligned regions (data not shown), probably due to large evolutionary distances even within the genus Trichosporon. Therefore, an orthology-based approach was chosen, which entailed identification of putative orthologous proteins between T. oleaginosus and other species by reciprocal BLAST analyses and plotting the genomic positions of orthologous genes. Between 4,587 and 5,048 putative orthologs were identified in the comparisons of T. oleaginosus with T. asahii, Tremella mesenterica, and the two Cryptococcus species, whereas the comparison of C. neoformans and C. gattii that was included as a control yielded 5,604 orthologs. Plotting of genomic positions of orthologous pairs showed a high degree of synteny between the two Cryptococcus species, as expected, but no large-scale synteny for any of the T. oleaginosus comparisons (Fig. 1). This also included the comparison to the sister species T. asahii, confirming the preliminary PROmer-based results and indicating a larger evolutionary distance between the two Trichosporon species than the distance between C. neoformans and C. gattii.
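The reciprocal BLAST step can be illustrated with a short sketch. It assumes, as an illustration only, that each proteome was searched against the other with BLAST and that the results are in the standard 12-column tabular format (query ID in column 1, subject ID in column 2, bit score in column 12). The text does not state the cutoffs or scripts actually used, so the function names and the "best hit by bit score" criterion below are assumptions.

```python
def best_hits(blast_lines):
    """Map each query ID to the subject ID with the highest bit score."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


def row(query, subject, bitscore):
    """Build one toy tabular line; the alignment columns are placeholders."""
    return "\t".join([query, subject, "90.0", "100", "0", "0",
                      "1", "100", "1", "100", "1e-50", str(bitscore)])


# Toy data: a1 and b1 are each other's best hit; a2's best hit (b2) prefers a1.
a_vs_b = [row("a1", "b1", 200), row("a1", "b2", 150), row("a2", "b2", 180)]
b_vs_a = [row("b1", "a1", 200), row("b2", "a1", 160), row("b2", "a2", 150)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Only pairs that are mutual best hits are retained as putative orthologs. In the toy data, a2 has no reciprocal partner and is dropped.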
FIG 1
Phylogenetic and synteny analyses. (A) Phylogenetic tree of species in the class Tremellomycetes based on analysis of 200 conserved single-copy genes. Bootstrap values (percentages) are indicated at the corresponding branches. Coprinopsis cinerea was used as an outgroup. The scale bar indicates the number of amino acid substitutions per site. (B) Genome-wide synteny between T. oleaginosus and other Tremellomycetes. Positions of orthologous genes were determined along the concatenated scaffolds from each species and are visualized as dot plots for the pairs T. oleaginosus/T. asahii and T. oleaginosus/C. neoformans. The analysis of C. neoformans versus C. gattii was included for comparison. Scaffolds were not specifically ordered for this analysis; therefore, there is a seemingly somewhat-random organization for the C. neoformans/C. gattii plot. (C) Syntenic gene pairs and triplets in T. oleaginosus compared to four other Tremellomycetes. Numbers of pairs and triplets of orthologous genes within 20 kb (pairs) and 40 kb (triplets) are shown as blue and orange bars, respectively. On the right, the same comparison for the two Cryptococcus species, C. neoformans and C. gattii, is shown. Note that the y axis is interrupted in two places for better visualization. BioProject numbers and references for genomes used for comparison are as follows: T. asahii PRJNA164647 (26), C. neoformans PRJNA411 (24), C. gattii PRJNA62089 (22), T. mesenterica PRJNA225529 (75).
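The pair counts in panel C can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: a syntenic pair is taken here to mean two genes on the same scaffold within 20 kb of each other in one species whose orthologs are also on one scaffold and within 20 kb of each other in the other species. The function name, the input dictionaries, and the use of gene start coordinates for distances are all hypothetical choices.

```python
from itertools import combinations


def syntenic_pairs(pos_a, pos_b, orthologs, max_dist=20_000):
    """Return gene pairs from species A that are close in both genomes.

    pos_a, pos_b: gene ID -> (scaffold, start coordinate)
    orthologs:    gene ID in species A -> gene ID in species B
    """
    def close(p, q):
        # Same scaffold and start coordinates no more than max_dist apart.
        return p[0] == q[0] and abs(p[1] - q[1]) <= max_dist

    genes = sorted(g for g in orthologs
                   if g in pos_a and orthologs[g] in pos_b)
    pairs = []
    # All-against-all comparison: simple, and adequate for a few
    # thousand orthologs.
    for g1, g2 in combinations(genes, 2):
        if (close(pos_a[g1], pos_a[g2])
                and close(pos_b[orthologs[g1]], pos_b[orthologs[g2]])):
            pairs.append((g1, g2))
    return pairs


# Toy example: only a1/a2 are neighbors in both genomes.
pos_a = {"a1": ("s1", 1_000), "a2": ("s1", 9_000),
         "a3": ("s1", 500_000), "a4": ("s2", 2_000)}
pos_b = {"b1": ("t1", 5_000), "b2": ("t1", 15_000),
         "b3": ("t1", 20_000), "b4": ("t1", 6_000)}
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3", "a4": "b4"}
print(syntenic_pairs(pos_a, pos_b, orthologs))
```

Triplets could be counted analogously by testing groups of three genes against the 40-kb window given in the caption.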
While our analysis did not show any macrosynteny between T. oleaginosus and other sequenced genomes, an analysis of syntenic gene pairs and triplets showed considerable microsynteny (Fig. 1). Nearly 1,100 syntenic gene pairs and 500 syntenic gene triplets were identified between T. oleaginosus and T. asahii, and about 700 and 400 syntenic pairs and triplets, respectively, were identified in comparisons of T. oleaginosus with T. mesenterica or the Cryptococcus species. However, these results are still much lower than those from a comparison of C. neoformans and C. gattii, again indicating a high degree of chromosomal rearrangements in T. oleaginosus and thus a large evolutionary distance compared to other sequenced genomes.
One major feature of interest in Tremellomycete genomes is the organization and evolution of mating-type regions (20). Sexual development in basidiomycetes is governed by mating-type genes encoding homeodomain transcription factors, pheromones, and pheromone receptors. These mating-type genes can be organized as one mating-type locus containing all genes or as two mating-type loci, one of which contains the transcription factor genes (HD locus) and the other that contains the pheromone and pheromone receptor genes (P/R locus). In the majority of basidiomycetes, the HD locus comprises at least two genes coding for homeodomain proteins of class 1 (HD1) and class 2 (HD2) (29, 30). This is the case in the Tremellomycetes Cryptococcus amylolentus, Kwoniella heveanensis, Kwoniella mangrovensis, and Tremella mesenterica, which harbor the basal arrangement of two separate mating-type loci, with two HD genes of classes HD1 and HD2 within the transcription factor-containing locus (31–33). An exception to this rule is found in C. neoformans, where the single mating-type locus contains only one or the other HD gene (34, 35). This HD transcription factor gene is clustered with pheromone precursor and receptor genes and additional genes involved in mating within a genomic region of >100 kb. With the exception of a gene conversion hot spot (36), this large mating-type region of C. neoformans displays restricted meiotic recombination and might represent an evolutionary trend from small mating regions to mating/sex chromosomes that is also seen in other groups, e.g., animals, plants, and algae (20, 34, 37, 38).
Trichosporon is a sister lineage to the Tremella/Kwoniella/Cryptococcus lineage within the Tremellomycetes (Fig. 1).
To learn more about the mating system of T. oleaginosus, we searched for homologs to proteins encoded by the well-studied mating-type locus of C. neoformans. In this fungus, strains harboring the HD1 gene SXI1 at the mating-type locus are designated MATα, whereas strains harboring the HD2 gene SXI2 are MATa (35). While for almost all MAT-encoded proteins, including Sxi1, putative homologs could be found in T. oleaginosus, no homologs were found for the HD2 transcription factor Sxi2 (see Table S2 and Fig. S1 and S2 in the supplemental material). Searches with HD proteins from other Tremellomycetes in the predicted T. oleaginosus proteins as well as tBLASTn searches in the genome assembly did not yield any putative Sxi2 homologs either. This was further confirmed by searches in the genomes of the T. asahii type strain (26) (see Table S2) and a T. asahii environmental strain (25) (data not shown), indicating that the currently available Trichosporon genomes do not encode an HD2 transcription factor. Homologs for the pheromone genes were also not found; however, this is not unusual, as pheromones are small peptides that harbor only short conserved motifs. Searches specifically for conserved pheromone motifs within the genome assemblies of T. oleaginosus and T. asahii as well as in the unmapped T. oleaginosus Illumina reads yielded several putative pheromone gene candidates (see Fig. S3 in the supplemental material). Whether these candidates actually encode pheromone precursors remains to be elucidated.
The T. oleaginosus homologs encoding putative mating-associated proteins are located on several different scaffolds, making a mating chromosome-like arrangement as found in C. neoformans unlikely (see Table S2 in the supplemental material).
Importantly, the HD transcription factor homolog SXI1 and the pheromone receptor gene STE3 are found on scaffolds 70 and 68, respectively, suggesting two independent mating-type loci harboring transcription factor and pheromone receptor genes, respectively (Fig. 2). Thus, with respect to the genomic organization of the mating-type genes at two different loci, T. oleaginosus is similar to other Tremellomycetes, with the exception of the pathogenic cryptococci, whereas the presence of only one HD transcription factor gene in T. oleaginosus resembles the derived organization of C. neoformans/C. gattii. However, as the last common ancestor of the Tremellomycetes most likely contained the “standard” two-HD gene arrangement (20, 31) and the genus Trichosporon branches at the base of the Tremellomycetes lineage (Fig. 1), the loss of one HD gene might represent a case of parallel evolution in Trichosporon and C. neoformans/C. gattii.
FIG 2
T. oleaginosus scaffolds 2, 68, and 70 contain homologs to genes from the C. neoformans mating-type locus. Characteristic mating-type genes are shown in red, and other genes that are part of the mating-type locus or flank the mating-type locus (gene g6341) of C. neoformans are shown in green. Genes unrelated to the mating-type locus are shown in gray. Genes are indicated only in those parts of the scaffolds that contained putative mating-type genes (the left part of scaffold 70 and the right parts of scaffolds 2 and 68). For comparison, homologous regions from the T. asahii type strain (26) are shown. In this species, the corresponding regions are present on two scaffolds (left part of scaffold JH977550 and middle part of scaffold 977583). Genes associated with SXI1 (HD locus) in panel A, genes associated with STE3 (P/R locus) in panel B.
In C. neoformans, C. gattii, C. amylolentus, K. heveanensis, K. mangrovensis, and T. mesenterica, several genes besides HD transcription factors, pheromones, and receptors were recruited into the mating-type loci, representing a trend toward larger mating regions culminating in the fusion of both mating-type loci into one large locus in C. neoformans/C. gattii (20, 31–33). Homologs of several of these genes are also found associated with the HD transcription factor or STE3, respectively, in T. oleaginosus (Fig. 2); however, the distribution of these genes between the two loci is distinct from that of the other Tremellomycetes. In T. mesenterica and K. heveanensis, the developmental genes STE11, STE12, and STE20, as well as several other genes (MYO2, IKS1), are linked to the P/R locus, whereas the essential gene RPL22 is linked to the HD locus (33). In T. oleaginosus, a similar linkage of RPL22 with the HD locus can be found, but STE11, STE20, MYO2, and IKS1 are also linked to the HD locus instead of the P/R locus (Fig. 2). STE12 of T. oleaginosus is located on a scaffold different from both the HD and P/R loci, but in the genome assembly of T. asahii (26) STE12 is present on the same scaffold as the putative pheromone receptor gene STE3 (Fig. 2). A relatively high degree of synteny of the genomic region(s) containing STE3 and STE12 in T. asahii and T. oleaginosus and the finding that the STE3 and STE12 regions are located at the ends of their respective scaffolds in T. oleaginosus suggest that these scaffolds might be linked in T. oleaginosus, too (Fig. 2B). Furthermore, the linkage of STE11, STE20, MYO2, and IKS1 to the HD locus was also found in T. asahii, indicating that this genomic arrangement is not an assembly artifact in T. oleaginosus (Fig. 2A).
Based on the Tremellomycetes phylogeny, the last common ancestor of Trichosporon and the Cryptococcus/Kwoniella/Tremella group is inferred to have been tetrapolar (20) (Fig. 1). Therefore, one hypothesis to explain the different distributions of mating-associated genes around SXI1 and STE3 would be that the recruiting of developmental genes like STE11, STE12, and STE20 into the mating-type regions occurred independently in Trichosporon and the sister groups, leading to different sets of genes becoming linked to the HD locus and the P/R locus, respectively (Fig. 3). Genes without a developmental function might have ended up linked to the mating-type loci as a consequence of recombination events involving larger genomic regions. Another possible hypothesis would be that the last common ancestor already had the Cryptococcus-like single mating region and that this region was broken up several times independently in the descendants. However, this hypothesis is more complex and less parsimonious than the first, because it would require (independent, but similar) genomic rearrangements in essentially every lineage except Cryptococcus.
The most parsimonious assumption of an independent recruitment event for one set of developmental genes into the HD locus or P/R locus would be consistent with an evolutionary trend toward larger mating-type regions (20). For a detailed discussion about possible causes/consequences, see Text S1 in the supplemental material.
FIG 3
Model for the evolution of mating-type loci in Tremellomycetes. Genes at the MAT loci containing homeodomain transcription factor genes (HD locus) or pheromone and receptor genes (P/R locus) are shown in red and blue, respectively. Genes involved in sexual development but not originally part of a MAT locus are shown in green; other genes are shown in gray. Only genes from the C. neoformans MAT locus that are also linked to the core MAT genes (STE3, HD genes) in the Trichosporon lineage are shown (STE11, STE12, STE20, IKS1, MYO2, and RPL22); other genes present at the MAT loci were left out for the sake of clarity. A trend toward integrating other developmental genes into the MAT loci is reflected in the recruitment of the STE12 gene into the P/R locus and the subsequent recruitment of an ancestral STE11/STE20 cluster into the P/R locus in the Cryptococcus/Kwoniella/Tremella lineage and into the HD locus in the Trichosporon lineage. In Cryptococcus, both MAT loci were subsequently fused. The loss of one HD gene at the HD-containing locus occurred independently in Cryptococcus and Trichosporon. Gene names are given according to naming in the C. neoformans MAT locus. A putative mating pheromone gene, MFA, in the Trichosporon lineage is shown in outline only in order to indicate that no such gene has been definitively identified in the Trichosporon lineage. The phylogenetic relationships are depicted according to descriptions in reference 20 and the illustrations in Fig. 1.
The presence of mating-type genes, and thus the possibility of a sexual cycle, is not only of interest for the analysis of the evolution of sex but also for potential development of T. oleaginosus for biotechnological applications, because genetically tractable organisms allow for easier development of molecular tools.
In addition, detailed knowledge about cellular adaptations to growth conditions relevant for biotechnological applications also facilitates strain development, and therefore we analyzed several T. oleaginosus transcriptomes, as described in the next sections.
Transcriptome analysis in the presence of different carbon sources.
Transcriptome analyses were performed under different cultivation conditions (see Text S1 in the supplemental material). Cultivation in complete medium (YPD) was compared with growth on the two alternative carbon sources, xylose (XYL) and NAG. To study the oil accumulation under relevant conditions for biomass utilization, we used xylose as carbon source under conditions of nitrogen limitation (NLM). While nitrogen limitation efficiently induces lipid synthesis, utilization of N-rich biomass, such as that derived from chitin, poses a challenge with this strategy (39). Therefore, we also investigated the possibility of inducing lipid synthesis with phosphate limitation (PLM).
Between 26 million and 100 million RNA-seq reads were obtained for each independent biological replicate (see Text S1 in the supplemental material). Of the predicted 8,322 genes, 8,256 (99.2%) were expressed under at least one condition (see Table S1 in the supplemental material). An analysis of the genes that were among the 500 most strongly expressed under each of the five growth conditions (top 500 analysis) (see Table S1) revealed several genes that were highly expressed under all cultivation conditions. These encoded mostly ribosomal proteins, translation initiation and elongation factors, histones, actin, thioredoxin, proteases, and enzymes from basic metabolism. The latter included enzymes from the tricarboxylic acid (TCA) cycle, mitochondrial respiration, the pentose-phosphate pathway, glycolysis, amino acid metabolism, glycogen metabolism, and cell wall synthesis and degradation.
Transcriptome analysis under lipid-accumulating conditions.
Under nonlimiting cultivation conditions, analysis of total lipids showed a maximum lipid formation of 500 mg/liter, or 4% (wt/wt) of the total biomass with any carbon source. The lipid fraction contained palmitic acid (C16:0), palmitoleic acid (C16:1), stearic acid (C18:0), oleic acid (C18:1), linoleic acid (C18:2), and linolenic acid (C18:3) as its main constituents (see Fig. S4 in the supplemental material). Lowering the concentration of ammonium sulfate from 4 g/liter to 1.2 mg/liter in the NLM sample led to an increased lipid yield of 5 g/liter and a lipid content as high as 50% (wt/wt) (see Fig. S4). In contrast, lowering the phosphate concentration in the PLM sample cohort from 0.4 mM to 0.14 mM, which significantly increases lipid synthesis in Cryptococcus curvatus (39), had only marginal effects in our experiments, resulting in 15% (wt/wt) lipid content with respect to the dried biomass (see Fig. S4).
The transcriptome from cultivation under nitrogen starvation with xylose as sole carbon source (NLM) was compared to that in YPD. A comparison to cultivation using a xylose-based medium with an abundant nitrogen source (XYL) served as an additional control. The automated annotation of core metabolism genes was manually confirmed (see Table S3 in the supplemental material), and we compared expression patterns of selected genes between different growth conditions. The transcriptome response under nitrogen limitation differed from that under all other conditions, indicating a different physiology during lipid assimilation (see Fig. S4 in the supplemental material). Differences in growth conditions also had a significant effect on the fatty acid profiles. For all data sets, oleic acid (C18:1) was the main component of the T. oleaginosus-derived triglyceride fraction (see Fig. S4). Interestingly, in comparison to other growth conditions, the amount of linoleic acid (C18:2) was decreased by 60 to 80% under N-limiting growth conditions. The biochemical basis for this shift is as yet unclear.
The manually annotated genes were used to reconstruct the central lipid metabolism in T. oleaginosus (Fig. 4; see also Table S3 in the supplemental material). Lipid metabolism requires an increased supply of NADPH for fatty acid biosynthesis. The most important producers of NADPH are malic enzyme (ME, Triol1|285398) and NADP+-dependent isocitrate dehydrogenase (IDP1, Triol1|288361), as well as glucose-6-phosphate dehydrogenase (G6PDH, Triol1|300435) and 6-phosphogluconate dehydrogenase (GND1, Triol1|307888) from the pentose-phosphate cycle. ME belongs to the top 500 transcripts under nitrogen and phosphate limitation, but not under control cultivation conditions (see Table S1 in the supplemental material). ME was 2-fold upregulated in the comparison of NLM versus XYL medium (see Table S3). However, in comparison to YPD, no upregulation was found. This is consistent with R. toruloides data (14), where ME is downregulated under nitrogen starvation. Interestingly, overexpression of ME increases lipid synthesis in several organisms 3- to 4-fold (10, 11). Similar to R. toruloides, G6PDH was upregulated under nitrogen starvation. Transcript levels of IDP1 and GND1 decreased (see Table S3). As T. oleaginosus was cultivated on the pentose xylose as the carbon source, involvement of the oxidative part of the pentose-phosphate shunt pathway was somewhat unexpected; however, an explanation might be the need to generate NADPH for lipid biosynthesis.
FIG 4
Lipid biosynthesis in T. oleaginosus. (A) Core lipid biosynthesis enzymes. Corresponding genes that are upregulated during growth on NLM versus YPD are marked in red (for details, see Table S3 in the supplemental material). FAS, fatty acid synthase. (B) Expression of core lipid biosynthesis genes under different growth conditions. Expression was analyzed by RNA-seq (three independent biological replicates). Genes that were significantly differentially expressed are indicated with a black star (adjusted P value, ≤0.05) or a white star (adjusted P value, ≤0.1).
Fatty acid synthase (FAS) initiates lipid synthesis by forming acyl-CoA from acetyl-CoA, malonyl-CoA, and NADPH. FAS in fungi can consist of one or two subunits; the latter is the case for Tremellomycetes, where two genes under the control of a divergent promoter encode the FAS complex (14, 40–43). Both subunits are also present and encoded by divergently transcribed genes in T. oleaginosus (FAS1, Triol1|330866; FAS2, Triol1|279681) (see Fig. S5 in the supplemental material). ATP:citrate lyase (ACL) generates acetyl-CoA from citrate, and acetyl-CoA carboxylase (ACC) further produces malonyl-CoA for fatty acid synthesis. FAS1, FAS2, ACC, and ACL are significantly upregulated under nitrogen limitation but not under phosphate limitation, which reflects the differences in lipid accumulation (Fig. 4). While these genes were also significantly overexpressed in R. toruloides (14), in the ascomycete Y. lipolytica none of them displayed a significant change in transcription level under nitrogen starvation (7).
Later-stage biosynthesis of neutral lipids and membranes involves transport of CoA-bound fatty acids into peroxisomes, desaturation and elongation reactions, and finally transfer of the acyl moiety to the glycerol backbone. It is expected that cells grown in limited medium also respond in terms of their lipid composition. 
The two genes for fatty acid desaturases, Triol1|289557 and Triol1|308253, are upregulated in the NLM-versus-XYL comparison but not in the NLM-versus-YPD comparison, and both desaturase genes are highly transcribed. The first belongs to the top 500 transcripts under NLM, PLM, and YPD conditions, and the second belongs to the top 500 transcripts that are shared under all cultivation conditions. In Y. lipolytica, one of the fatty acid desaturases also showed a higher expression level under nitrogen starvation (7). 1-Acyl-sn-glycerol-3-phosphate acyltransferase (Triol1|249914), which is involved in neutral lipid synthesis, is upregulated in NLM compared to XYL medium. Serine palmitoyltransferase (Triol1|95840), which catalyzes the first and committed step of sphingolipid synthesis, was not upregulated.
In R. toruloides, several lipid-degrading enzymes were upregulated under lipid-accumulating conditions, including lipases and the glyoxylate pathway enzymes isocitrate lyase and malate synthase. This was attributed to the presence of free fatty acids due to an elevated autophagy process (14). In T. oleaginosus, isocitrate lyase (Triol1|282916) is upregulated in NLM compared to XYL medium, and malate synthase (Triol1|288256) is upregulated in NLM compared to both YPD and XYL media (see Table S3 in the supplemental material), suggesting a similar mechanism. Acyl-CoA synthetases initiate the degradation of free fatty acids. The 13 putative acyl-CoA synthetases in T. oleaginosus show distinct patterns of regulation. Under nitrogen starvation, two are upregulated while three are downregulated (see Table S3), indicating a highly differentiated lipid metabolism in this species.
It is interesting that in several fungi, including C. neoformans, sexual development is induced by starvation, and specifically by nitrogen limitation (44–49). In T. oleaginosus, the putative mating-type transcription factor gene SXI1 is upregulated under nitrogen limitation and during growth with NAG as carbon source, compared to YPD. 
This is also the case for the two putative developmental genes STE11 and STE20, which are linked to SXI1 on the same scaffold (see Fig. S1 in the supplemental material), suggesting that a similar connection between nitrogen starvation and sexual development might be present in T. oleaginosus. Mating-type genes of other species have already been shown to regulate not only sexual development but also other processes, e.g., pathogenicity in the basidiomycetes C. neoformans and Ustilago maydis and antibiotic production in the ascomycete Penicillium chrysogenum (50–55). Sexual development has not yet been observed in T. oleaginosus, because currently only one mating-type configuration is known (see Text S1 in the supplemental material); however, relationships between mating-type-dependent regulation, developmental events, and lipid accumulation will be of interest for future applications.
Nitrogen metabolism.
As described above, T. oleaginosus accumulates lipids under nitrogen starvation. Besides metabolic pathways directly involved in fatty acid biosynthesis, this condition also leads to changes in nitrogen metabolism itself that can be important factors in biotechnological applications. Amino acid transporters and permeases play an important role for nitrogen metabolism, and a total of 66 corresponding genes were identified in T. oleaginosus (see Table S3 in the supplemental material). Under nitrogen starvation, 32 were found to be upregulated, 11 were downregulated, and 23 did not change significantly. Under phosphate-limiting conditions, 19 were upregulated and 17 were downregulated, whereas the majority (30) were not significantly changed compared to growth in YPD. Zhu et al. also found upregulation of many amino acid transporters in R. toruloides under nitrogen-limiting conditions (14). This was confirmed for Y. lipolytica by Morin et al. (7), who compared expression levels between stages of biomass production and lipid accumulation. The downregulation of some amino acid transporters in minimal medium compared to YPD in T. oleaginosus might be a reaction to the smaller amount of peptides and amino acids in the minimal medium.
An important strategy for cells to adapt to nitrogen scarcity is the degradation of nonessential proteins to peptides and ultimately amino acids, which are then again available for protein biosynthesis. We found that six proteases were upregulated and another six were downregulated during nitrogen limitation.
The source of nitrogen plays an important role for the accumulation of lipids. In R. toruloides, it was found that organic nitrogen compounds such as urate or urea led to a higher accumulation of lipids than observed with inorganic sources (e.g., NH4Cl) (56). Based on data from C. neoformans (57), the urate catabolic pathway was annotated in T. oleaginosus (Fig. 5; see also Table S3 in the supplemental material). In C. neoformans, the gene encoding allantoinase, DAL1, and the urate oxidase gene URO1 are induced in the presence of uric acid (58). In T. oleaginosus, we found a 5.6-fold upregulation of URO1 (Triol1|306408) but a downregulation of all subsequent steps of urate utilization, including URO2 (Triol1|317784), URO3 (Triol1|309759), and DAL1 (Triol1|311323) under nitrogen-limited conditions compared to full medium. In contrast, the majority of allantoin permeases were found to be upregulated, presumably to facilitate the import of remaining nitrogen sources. We also found a strong upregulation of the urea transporter gene DUR3 (Triol1|161533) and the putative ammonia transporter genes (Triol1|302913 and Triol1|69276), all of which are also among the 500 most strongly expressed genes under nitrogen-limited conditions (see Tables S1 and S3 in the supplemental material). Furthermore, the gene for the urease URE1 was upregulated. Urease acts downstream of DUR3 to convert urea into ammonium for central nitrogen metabolism (Fig. 5).
FIG 5
Key enzymes of central and peripheral nitrogen metabolism in T. oleaginosus. Corresponding genes that are upregulated during growth on NLM versus YPD are marked in red, and downregulated genes are shown in blue (for details, see Table S3 in the supplemental material). HIU, hydroxyisourate; OHCU, 2-2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline.
A central element of the response to nitrogen limitation is the conversion of ammonium to glutamate and glutamine (Fig. 5). We found a strong upregulation of glutamate synthase (GOGAT; Triol1|329209), while glutamine synthetase 2 (GLN2; Triol1|283213), which catalyzes the reverse reaction, was downregulated. No change was found for glutamate dehydrogenase 1 (GDH1; Triol1|286063), while the enzyme for the reverse reaction, GDH2 (Triol1|299402), was downregulated. These transcript changes suggest a shift toward the synthesis of glutamate, presumably in an effort to make the remaining nitrogen available for protein biosynthesis (Fig. 5).
The response in T. oleaginosus differs from that observed in R. toruloides, where GDH1, GDH2, and GLN1 were upregulated under nitrogen starvation (14), and from Y. lipolytica, where GLN1 was also upregulated (7). However, in both fungi, genes encoding transporters for nitrogen-containing compounds like ammonia and amino acids were upregulated, similar to T. oleaginosus, suggesting that parts of the response to nitrogen limitation are similar in these fungi but distribution of nitrogen-based metabolites toward downstream pathways may differ.
Transcriptome analysis using alternative carbon sources.
After cellulose, chitin is the most abundant polysaccharide in nature. It is a major component in the cell walls of fungi and widespread in other groups, such as insects and crustaceans. Here, we have used the chitin monomer NAG as the sole carbon source to analyze the resulting changes in transcription in T. oleaginosus. NAG can be channeled into glycolysis via fructose 6-phosphate (Fru-6P) or into chitin biosynthesis, with both pathways requiring phosphorylation of NAG first (Fig. 6). A predicted NAG kinase with homology to the C. albicans NAG kinase CaNAG5 (59) is present in T. oleaginosus (Triol1|249892) and upregulated during growth on NAG (Fig. 6; see also Table S3 in the supplemental material). Enzymes involved in chitin biosynthesis downstream of phosphorylated NAG are present in T. oleaginosus, including seven chitin synthase genes that represent the chitin synthase classes I to V, similar to the situation in C. neoformans (60), but these genes are not consistently upregulated during growth on NAG (see Table S3). In contrast, genes encoding proteins required for conversion to Fru-6P (N-acetyl-glucosamine-6-phosphate deacetylase [Triol1|281629] and glucosamine-6-phosphate deaminase [Triol1|281628]) are 180- and 1,000-fold upregulated, respectively. Both genes were among the 500 most strongly expressed genes in the NAG samples, but not in other samples (Fig. 6; see also Tables S1 and S3), reflecting the large extent to which T. oleaginosus channels NAG into its primary metabolism. Interestingly, genes for glycolysis are not strongly regulated, while genes playing a role in the tricarboxylic acid cycle and respiration are mostly downregulated during growth on NAG (see Table S3).
FIG 6
NAG is mainly channeled into primary metabolism in T. oleaginosus. Pathways were constructed according to the predicted C. neoformans pathways in the KEGG database (77) and the published data for C. albicans and C. neoformans (59, 60, 62). Genes that are upregulated during growth on NAG are marked in red, and downregulated genes are shown in blue (for details, see Table S3 in the supplemental material).
For use of chitin as a carbon source, chitinolytic enzymes are required for breakdown of the polymer into oligomeric and monomeric units (61). T. oleaginosus encodes three putative chitinases and two putative β-N-acetylhexosaminidases (see Table S3 in the supplemental material), similar to C. neoformans, which encodes four and one enzyme, respectively (62). Two of the chitinase genes are downregulated during growth on NAG, probably reflecting feedback inhibition, while one β-N-acetylhexosaminidase gene is upregulated and the other downregulated (see Table S3). In summary, the pronounced upregulation of genes involved in channeling NAG into Fru-6P, and thus into primary metabolism, together with the large number of proteases and amino acid transporters (see above) suggest that T. oleaginosus is well-adapted to protein- and chitin-rich environments.
Conclusions.
Here, we have analyzed the genome of the lipid-accumulating basidiomycete yeast T. oleaginosus as well as transcriptomes under various physiological conditions. A key finding was the lineage-specific arrangement of mating-type genes, indicating that recruitment of developmental genes to the ancestral tetrapolar mating-type loci occurred independently in the Trichosporon and Cryptococcus lineages, supporting the hypothesis of a general trend toward larger mating-type regions in fungi, similar to the evolution of sex chromosomes in animals and plants. Furthermore, the transcriptome data showed specific metabolic shifts within lipid and nitrogen metabolism under nitrogen limitation, concurrent with the accumulation of fatty acids under this condition. We also showed that N-acetylglucosamine is efficiently shuttled into primary metabolism, supporting the ability of T. oleaginosus to grow on a variety of substrates, including chitin-rich ones. The genome and transcriptome data are valuable resources for future biotechnological applications using T. oleaginosus and for the analysis of the evolution of sex in eukaryotes.
MATERIALS AND METHODS
Strains and growth conditions.
T. oleaginosus strains IBC0246 (from the culture collection of the research group Industrial Biocatalysis, TU Munich), ATCC 20508, and ATCC 20509 were maintained on YPD (see Text S1 in the supplemental material). For transcriptome analysis, T. oleaginosus was grown on YPD (complete medium), NLM (nitrogen limitation), PLM (phosphate limitation), XYL (xylose as carbon source), or NAG (N-acetylglucosamine as carbon source); media compositions are provided in Text S1.
Analysis of total lipid content.
For the determination of lipid content, cells were lysed using high-pressure homogenization and lipids were extracted according to the methods of Bligh and Dyer (63). The solvent phase was then evaporated under nitrogen, and the dry weight of the remaining lipids was determined.
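The relative lipid content reported as % (wt/wt) follows directly from these two gravimetric values. The following is a minimal Python sketch, assuming that lipid and biomass dry weights are given in the same unit; the function name and the example values are illustrative and are not taken from the authors' workflow.

```python
def lipid_content_percent(lipid_dry_weight: float, biomass_dry_weight: float) -> float:
    """Return the lipid content as % (wt/wt) of the dry biomass.

    Both arguments must use the same unit (e.g., g/liter of culture).
    """
    if biomass_dry_weight <= 0:
        raise ValueError("biomass dry weight must be positive")
    if not 0 <= lipid_dry_weight <= biomass_dry_weight:
        raise ValueError("lipid dry weight must lie between 0 and the biomass dry weight")
    return 100.0 * lipid_dry_weight / biomass_dry_weight


# Hypothetical example: 5 g/liter of lipid recovered from 10 g/liter of dry biomass.
example = lipid_content_percent(5.0, 10.0)
```

Because the result is a simple ratio, any error in either dry weight carries over directly into the reported percentage.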
Analysis of lipid profiles under different growth conditions.
T. oleaginosus was grown in shake flask cultures (50 ml) for 6 days at 28°C and 250 rpm in the different media (see Text S1 in the supplemental material). The biomass was subsequently harvested and prepared for lipid extraction. The direct transesterification of wet biomass was performed according to the protocol of Griffiths et al. (64) with the following modifications: we replaced the C17 tag with a C12 tag, we replaced BF3-methanol with an HCl-methanol solution, and we omitted the C19-ME. Subsequently, the resulting fatty acid methyl ester (FAME) extract was injected into a Shimadzu GC-2010 Plus system equipped with a flame ionization detector and a Zebron ZB-WAX column (30 m by 0.32 mm [inner diameter], 0.25-µm film thickness; Phenomenex, USA). Standard split/splitless injection was used, with a split ratio of 10 and an injector temperature of 240°C. The column temperature was increased from 150°C to 240°C at 5°C/min. Nitrogen (3 ml/min) was used as the carrier gas, and the detector temperature was 245°C. Peaks were identified by retention time using the FAMEs marine oil standard (Restek). Peak areas were used to quantify each FAME relative to the internal standards.
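Quantification against an internal standard can be sketched as follows. This Python example is an illustration only: it assumes a single internal standard and an equal detector response for all FAMEs (response factor of 1), neither of which is stated in the protocol above, and all function names and peak areas are hypothetical.

```python
from typing import Dict


def fame_amounts(
    peak_areas: Dict[str, float],
    standard_area: float,
    standard_amount_mg: float,
) -> Dict[str, float]:
    """Estimate the amount (mg) of each FAME from its peak area.

    Assumption: every FAME has the same detector response as the internal
    standard, so amounts scale linearly with the area ratio.
    """
    if standard_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return {
        fame: area / standard_area * standard_amount_mg
        for fame, area in peak_areas.items()
    }


def fatty_acid_profile(amounts_mg: Dict[str, float]) -> Dict[str, float]:
    """Convert absolute FAME amounts into a relative profile (% of total)."""
    total = sum(amounts_mg.values())
    if total <= 0:
        raise ValueError("total FAME amount must be positive")
    return {fame: 100.0 * amount / total for fame, amount in amounts_mg.items()}


# Hypothetical peak areas (arbitrary units) and 0.5 mg of internal standard.
areas = {"C16:0": 200.0, "C18:1": 600.0, "C18:2": 200.0}
amounts = fame_amounts(areas, standard_area=100.0, standard_amount_mg=0.5)
profile = fatty_acid_profile(amounts)
```

If FAME-specific response factors are available, they would be applied as an additional multiplier per fatty acid before the profile is calculated.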
Genome sequencing and assembly.
Genomic DNA was prepared as described in Text S1 in the supplemental material. One hundred nanograms of genomic DNA was sheared to 270 bp by using the Covaris E210 apparatus and size selected by using SPRI beads (Beckman Coulter). The fragments were treated with end repair, A-tailing, and ligation of Illumina compatible adapters (IDT, Inc.) by using the KAPA Illumina library creation kit (KAPA Biosystems). Quantitative PCR was used to determine the concentrations of the libraries. Libraries were sequenced as paired ends with a read length of 150 bp on the Illumina HiSeq system. The resulting fastq files were quality-control filtered to separate mitochondrial data and remove artifact/process contamination, and the information was initially assembled using the Velvet assembler (65). The Velvet assembly was used to simulate long mate-pair libraries, the first with inserts of 2,000 ± 50 bp, the second with inserts of 5,000 ± 50 bp. These two software-constructed libraries were then assembled together with the original Illumina library by using AllPathsLG release version R47710 (66) to produce a 19.8-Mb assembly in 221 contigs and 180 scaffolds with a 115.8× read depth coverage (Table 1). K-mer analysis of the filtered sequence reads suggested that the genome is haploid (data not shown).
Transcriptome sequencing and quantitative analysis of gene expression.
Five different growth conditions (YPD, NLM, PLM, XYL, and NAG) were analyzed with two (YPD) or three (all others) independent biological replicates (see Text S1 in the supplemental material). Total RNA was prepared as described in Text S1. Stranded cDNA libraries were generated using the Illumina Truseq stranded RNA LT kit (for details, see Text S1). Paired-end sequencing was performed using an Illumina HiSeq2000 instrument, generating 2 × 150-bp or 2 × 157-bp reads from each library (see Text S1).
Raw fastq file reads were filtered and trimmed using the JGI QC pipeline, resulting in the filtered fastq file. Using BBDuk (https://sourceforge.net/projects/bbmap/), raw reads were evaluated for artifact sequence by kmer matching (kmer value, 25), allowing 1 mismatch, and detected artifacts were trimmed from the 3′ end of the reads. RNA spike-in reads, PhiX reads, and reads containing any Ns were removed. Quality trimming was performed using the Phred trimming method set at Q6. Finally, following trimming, reads under the length threshold were removed (minimum length, 1/3 of the original read length). For de novo transcriptome assembly, which was used for annotation (see below), 432,000,000 paired-end 150-bp Illumina HiSeq-2000 reads of stranded RNA-seq data (10 libraries) were used as input for Rnnotator v3.3.1 (67). For quantitative analysis of gene expression, reads from each library were aligned to the reference genome by using TopHat (68). DESeq2 (69) was used to determine which genes were differentially expressed between pairs of conditions. For an analysis of the 500 most strongly expressed genes under each condition (Top 500 analysis), RPKM (reads per kilobase per million mapped reads) values were calculated for each gene and condition. For details, see Text S1 in the supplemental material.
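The RPKM calculation and the Top 500 selection can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names are hypothetical, and the total number of mapped reads is assumed here to be the sum of the reads assigned to genes, which may differ from the definition used in Text S1.

```python
from typing import Dict, List, Set


def rpkm(read_counts: Dict[str, int], gene_lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Reads per kilobase of gene length per million mapped reads.

    Assumption: the total number of mapped reads is the sum of the
    per-gene read counts passed in.
    """
    total_mapped = sum(read_counts.values())
    if total_mapped == 0:
        raise ValueError("no mapped reads")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_mapped)
        for gene, count in read_counts.items()
    }


def top_n(values: Dict[str, float], n: int = 500) -> List[str]:
    """Return the n genes with the highest expression values, highest first."""
    return sorted(values, key=values.get, reverse=True)[:n]


def shared_top(per_condition: Dict[str, Dict[str, float]], n: int = 500) -> Set[str]:
    """Genes that are among the top n under every condition."""
    top_sets = [set(top_n(values, n)) for values in per_condition.values()]
    return set.intersection(*top_sets)
```

Calling `shared_top` with one RPKM dictionary per growth condition yields the genes that rank among the top n in all of them, which corresponds to the set of genes highly expressed under all cultivation conditions.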
Genome annotation.
The genome assembly of T. oleaginosus was annotated using the JGI annotation pipeline (70), which combines several gene prediction and annotation methods and integrates the annotated genome into the Web-based fungal resource MycoCosm (27) for comparative genomics. Details are provided in Text S1 in the supplemental material.
Synteny analysis.
An orthology-based analysis of synteny was performed by determining orthologs for all T. oleaginosus proteins in the predicted proteomes of three other Tremellomycetes by reciprocal BLAST analysis (71) and using custom-made Perl scripts based on BioPerl modules (72) to determine the positions of orthologous proteins on sequenced scaffolds or chromosomes.
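The reciprocal-best-hit criterion behind this kind of ortholog assignment can be sketched as follows. The analysis itself used BLAST and custom Perl scripts based on BioPerl; the Python code below is only an illustration, and the input format (tuples of query, subject, and bit score) as well as the function names are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query ID, subject ID, bit score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Keep, for every query, the subject with the highest bit score."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> Dict[str, str]:
    """Pairs (a, b) in which a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {a: b for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical toy data: protein IDs and bit scores are invented.
a_vs_b = [("A1", "B1", 250.0), ("A1", "B2", 90.0), ("A2", "B2", 120.0), ("A3", "B1", 80.0)]
b_vs_a = [("B1", "A1", 245.0), ("B2", "A1", 95.0), ("B2", "A2", 118.0)]
orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

With the positions of each ortholog pair on the scaffolds or chromosomes of both species, conserved gene order can then be assessed.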
Phylogenetic analyses.
Multiple alignments were created in ClustalX (73) and trimmed with Jalview (74), and the same alignment was used for analysis by distance matrix (DM), maximum parsimony (MP), or Bayesian methods (for details, see Text S1 in the supplemental material). For generating a species tree of Tremellomycetes, the genomes of T. asahii (26), T. mesenterica (75), C. neoformans (24), C. gattii (22), and Coprinopsis cinerea (76) were used for comparative purposes (details are given in Text S1 in the supplemental material).
Accession numbers.
The genome sequence and annotation are publicly available via the JGI MycoCosm portal (http://genome.jgi.doe.gov/Triol1) (27). The BioProject accession number for the genome is PRJNA239490. The sequence reads that were used for the assembly of the T. oleaginosus genome were submitted to the NCBI sequence read archive (accession number SRX873336) and the assembly was submitted to GenBank (accession number JZUH00000000.1). The BioProject accession number for the transcriptome is PRJNA250808. Sequences for the sxi1 and ste3 regions of strains ATCC 20508 and ATCC 20509 and the ITS region of strain IBC0246 were submitted to GenBank (NCBI) under the following accession numbers, respectively: KM821409 (sxi1 ATCC 20508), KM821410 (sxi1 ATCC 20509), KM821407 (ste3 ATCC 20508), KM821408 (ste3 ATCC 20509), and KM821406 (ITS region IBC0246).Predicted protein-coding genes of Trichosporon oleaginosus (sheet 1), and genes that are among the 500 genes with the strongest expression under one or more condition (sheet 2).Table S1, XLSX file, 2.9 MBBLASTP analysis results for Cryptococcus neoformans var. neoformans mating-type proteins versus all predicted proteins from T. oleaginosus and T. asahiiTable S2, XLSX file, 0.02 MBGenes involved in lipid metabolism and related pathways.Table S3, XLSX file, 0.1 MBSupplemental materials and methods, with additional information about mating-type analyses. DownloadText S1, PDF file, 0.1 MBPhylogenetic analysis and expression of SXI1 and STE3. (A) Phylogenetic analysis of the HD transcription factor Sxi1. (Left) Neighbor-joining phylogenetic tree shows bootstrap percentages from 1,000 bootstrap replicates at the branches. (Right) Phylogenetic tree calculated with MrBayes, with Bayesian probabilities given at the branches. In both trees, HD transcription factor sequences cluster in groups according to HD class. (B) Multiple alignments of the homeodomains from Sxi1 homologs (class HD1 homeodomain transcription factors) from T. 
oleaginosus and other basidiomycetes. The three conserved helices are underlined in black. A 3-amino-acid extension characteristic for the HD1 homeodomain factors (in contrast to the HD2 homeodomains) is underlined in green. The conserved DNA binding motif WFXNXR within helix III is indicated. Labeling of helices and motifs according to Kües et al. (The Mycota XIV, p 97–160, Springer, Berlin, Germany, 2011). Species: T. o., Trichosporon oleaginosus; T. a., Trichosporon asahii; C. n., Cryptococcus neoformans; C. g., Cryptococcus gattii; K. h., Kwoniella heveanensis; T. m., Tremella mesenterica; C. c., Coprinopsis cinerea; C. s., Coprinopsis scobicola. (C) Phylogenetic analysis of Ste3 proteins from Tremellomycetes. Neighbor-joining and maximum parsimony result in the same tree topology, with bootstrap percentages from 1,000 bootstrap replicates given at the branches. (Top) Neighbor joining; (bottom) maximum parsimony. Ste3 sequences cluster in groups according to mating-type specificity. S. cerevisiae Ste3 was used as an outgroup. (D) Expression of MAT genes SXI1 and STE3 and putative developmental genes linked to the MAT genes. Log2 values of ratios of gene expression under different growth conditions compared to growth on full medium (YPD) are shown. Padj, P values that were adjusted for multiple testing. Gene expression values that were significantly upregulated are shown in red, and other values are shown in gray. DownloadFigure S1, PDF file, 0.2 MBMultiple alignment of Ste3 homologs. A conserved proline residue that is present in the MATα group proteins is indicated in red; it is not present in the MATa-specific Ste3 proteins from Tremellomycetes, which include T. oleaginosus Ste3. In the constitutively active pheromone receptor-like Cpr2 protein, this residue is changed to leucine (Y. P. Hsueh, C. Xue, and J. Heitman, EMBO J 28:1220–1233, 2009). However, no homolog to Cpr2 was found in T. oleaginosus. 
DownloadFigure S2, PDF file, 0.3 MBSearch for putative pheromone genes. (A) Multiple alignments of (putative) pheromones from Tremellomycetes. The pheromone Tra10 from T. mesenterica is known only in its mature peptide form. The green arrowhead indicates the N-terminal amino acid (glutamic acid [E]) of the T. mesenterica Tra13 pheromone after processing. The T. oleaginosus peptide encoded on scaffold 68 lacks this conserved N-terminal amino acid of the putative processed pheromone, whereas the peptide encoded on scaffold 11 is not linked to STE3 or other mating-type genes (see panel B), as is usual with pheromone precursor genes in other basidiomycetes. (B) Genomic localization of putative pheromone genes from Trichosporon species. Both T. asahii putative pheromone genes and the T. oleaginosus putative pheromone gene on scaffold 68 are on the same scaffolds as the respective STE3 genes, whereas the T. oleaginosus putative pheromone gene on scaffold 11 is not close to any predicted mating-type gene. Species abbreviations: C. g., Cryptococcus gattii; C. n., Cryptococcus neoformans; K. h., Kwoniella heveanensis; T. a., Trichosporon asahii; T. m., Tremella mesenterica; T. o., Trichosporon oleaginosus. DownloadFigure S3, PDF file, 0.2 MBLipid accumulation and fatty acid profiles of T. oleaginosus grown in different media and expression of metabolism genes under different growth conditions. (A to C) Cells were grown for 2 and 5 days in three independent replicates for each time point. Lipids (A) and biomass (B) were measured, and the relative lipid content (C) was calculated as the percentage of lipids per biomass unit. Error bars give standard deviations (A and B) or the coefficient of variation (C). (D) Fatty acid profiles of the lipid fraction grown under different nutrient conditions. PLM, phosphate limitation with xylose as carbon source; NLM, nitrogen starvation with xylose as carbon source. (E) Expression of core carbon and nitrogen metabolism genes. 
A total of 211 genes that were manually annotated as being involved in core carbon or nitrogen metabolism and were expressed under all analyzed conditions were included in the analysis. Hierarchical clustering and heat map generation of the log2 fold ratios were performed in R. Details for the genes and annotations can be found in Table S3. Download Figure S4, PDF file, 0.2 MB.

Fatty acid synthase (FAS) genes and proteins of Tremellomycetes. (A) Two T. oleaginosus subunits of FAS are encoded by two adjacent, divergently transcribed genes on scaffold 88. Arrows in the top row indicate coding regions (introns not shown). The two derived proteins comprise the eight essential FAS subunits (bottom row). (B) Genomic location of FAS-encoding genes in several Tremellomycetes. In all cases, FAS is encoded by two adjacent genes that are divergently transcribed. The corresponding peptides have the same domain organization as the T. oleaginosus proteins shown in panel A. Download Figure S5, PDF file, 0.2 MB.