Brenda D Wingfield1, Gerald F Bills2, Yang Dong3,4,5, Wenli Huang6, Wilma J Nel1, Benedicta S Swalarsk-Parry1, Niloofar Vaghefi7, P Markus Wilken1, Zhiqiang An2, Z Wilhelm de Beer1, Lieschen De Vos1, Li Chen2, Tuan A Duong1, Yun Gao8, Almuth Hammerbacher9, Julie R Kikkert10, Yan Li2,11, Huiying Li12, Kuan Li13, Qiang Li6, Xingzhong Liu13, Xiao Ma14, Kershney Naidoo1, Sarah J Pethybridge7, Jingzu Sun11,13, Emma T Steenkamp1, Magriet A van der Nest1, Stephanie van Wyk1, Michael J Wingfield1, Chuan Xiong6, Qun Yue2,15, Xiaoling Zhang13.
Abstract
Draft genomes of the species Annulohypoxylon stygium, Aspergillus mulundensis, Berkeleyomyces basicola (syn. Thielaviopsis basicola), Ceratocystis smalleyi, two Cercospora beticola strains, Coleophoma cylindrospora, Fusarium fracticaudum, Phialophora cf. hyalina and Morchella septimelata are presented. Both mating types (MAT1-1 and MAT1-2) of Cercospora beticola are included. Two strains of Coleophoma cylindrospora that produce the sulfated homotyrosine echinocandin variants FR209602, FR220897 and FR220899 are presented. The sequencing of Aspergillus mulundensis, Coleophoma cylindrospora and Phialophora cf. hyalina has enabled mapping of the gene clusters encoding the chemical diversity from the echinocandin pathways, providing data that reveal the complexity of secondary metabolism in these different species. Overall, these genomes provide a valuable resource for understanding the molecular processes underlying pathogenicity (in some cases), biology and toxin production of these economically important fungi.
Keywords: Beta vulgaris; Carya cordiformis; Pitch canker; echinocandin gene clusters; mulundocandins; pneumocandins
Year: 2018 PMID: 30018880 PMCID: PMC6048567 DOI: 10.5598/imafungus.2018.09.01.13
Source DB: PubMed Journal: IMA Fungus ISSN: 2210-6340 Impact factor: 3.515
Whole genome DNA sequence assemblies generated in Annulohypoxylon stygium MG137. The genomes of A. stygium MG137 were generated using next generation sequencing technology.
| Genome | |
|---|---|
| Coverage | 31.26 |
| BUSCO | 96.6% |
| Total sequence length (Mb) | 47.5 |
| Scaffolds | 1854 |
| Scaffold N50 (bp) | 598 310 |
| GC (%) | 46 |
| Predicted gene models | 12 498 |
| Total CAZYmes | 757 |
| Auxiliary activities | 153 |
| Pectate lyases | 13 |
| Glycosyltransferases | 106 |
| Glycoside hydrolases | 297 |
| Carbohydrate esterases | 125 |
| Carbohydrate binding motifs | 63 |
| Total SM clusters | 90 |
| Type I polyketide synthetases (PKSs) | 36 |
| Type III PKSs | 1 |
| Nonribosomal peptide synthetases (NRPSs) | 21 |
| Terpene clusters | 10 |
| Hglks | 0 |
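The scaffold N50 statistic reported in these tables can be illustrated with a minimal sketch. It assumes the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name `scaffold_n50` and the toy lengths are hypothetical and are not taken from the assemblies described here.

```python
def scaffold_n50(lengths):
    """Return the N50 of a list of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (total length 300), not real data.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(scaffold_n50(toy_lengths))  # 70, because 80 + 70 = 150 is half of 300
```

A larger N50 for a similar total length generally indicates a more contiguous assembly, which is why it is reported alongside the scaffold count.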
Fig. 1. Maximum Likelihood (ML) phylogenetic analysis of the genus Annulohypoxylon and the closely related genus Hypoxylon using MEGA 6.06, based on partial gene sequences of β-tubulin. Bootstrap values were calculated using 1000 replicates to assess node support. The sequences of the Annulohypoxylon stygium isolates used for verification were extracted from the assembled genomes. Reference sequences, with their accession numbers, were obtained from the NCBI database.
Fig. 2. Some naturally occurring echinocandins described in the patent literature.
General features of the genomes of Coleophoma cylindrospora BP6252 and BP5796, Phialophora cf. hyalina BP5553, and Aspergillus mulundensis DSMZ 5745.
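| Feature | Coleophoma cylindrospora BP6252 | Coleophoma cylindrospora BP5796 | Phialophora cf. hyalina BP5553 | Aspergillus mulundensis DSMZ 5745 |
|---|---|---|---|---|
| Assembly size (Mb) | 42.4 | 40.4 | 33.6 | 45 |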
| Scaffolds | 77 | 45 | 32 | 160 |
| Scaffold N50 (Mb) | 2.3 | 2 | 3.8 | 2.8 |
| Coverage (fold) | 100 | 100 | 102 | 100 |
| G+C content (%) | 48.7 | 48.5 | 48.2 | 43.2 |
| Protein-coding genes | 14177 | 13257 | 10707 | 11603 |
| Gene density (per Mb) | 337.55 | 331.42 | 324.45 | 257.84 |
| Exons per gene | 3.15 | 3.13 | 3.12 | 3.02 |
| PKSs | 15 | 15 | 19 | 25 |
| NRPSs | 8 | 6 | 13 | 19 |
| PKS-NRPS hybrids | 0 | 0 | 6 | 1 |
| DMATSs | 2 | 2 | 2 | 4 |
| Terpene synthases | 1 | 1 | 4 | 4 |
| Chalcone or stilbene synthase gene | 0 | 0 | 1 | 0 |
| Secondary metabolite gene clusters | 30 | 28 | 48 | 59 |
Fig. 3. Maximum Likelihood tree of ex-type and authentic strains of Aspergillus sect. Nidulantes (25 strains) inferred from an alignment of the concatenated sequences of the ITS-28S rDNA, ribosomal polymerase II, β-tubulin, and calmodulin genes. Data were resampled from Chen . DSMZ 5745 is labelled in red, and A. unguis was positioned as the outgroup. The Maximum Likelihood tree was based on the Tamura-Nei model. The tree with the highest log likelihood (–13 959.85) is shown. Branches are labelled with the percentage of trees in which the associated taxa clustered together. A discrete gamma distribution was used to model evolutionary rate differences among sites (5 categories; +G, parameter = 0.2371). Branch lengths were measured in the number of substitutions per site. The dataset included 3329 positions. Data were analyzed in MEGA7 (Kumar ). "Type" = ex-type cultures.
Fig. 4. Maximum Likelihood (ML) phylogram derived from the analyses of the partial MCM7 gene sequences for species in Ceratocystidaceae. CLCbio Genomics Workbench v. 9.5 (CLCbio, QIAGEN, Aarhus, Denmark) was used to screen the genome of B. basicola isolate CMW 49352 and to identify and extract the MCM7 gene using an available reference sequence for the gene from B. basicola (Accession: MF967102). A dataset was prepared based on the phylogenies of Nel and sequences were downloaded from NCBI GenBank. DNA sequence alignments of the dataset were done using the online version of MAFFT v. 7 (Katoh & Standley, 2013). The ML analyses were performed in MEGA v. 6.06 (Tamura ) using the GTR model. Values shown at nodes are confidence values >75 %. The sequence from the B. basicola genome is indicated in bold.
Fig. 5. A Maximum Likelihood phylogeny showing Ceratocystidaceae isolates for which published whole-genome sequences are available, including that of C. smalleyi discussed here. The 60S, LSU, and MCM7 gene regions were used; these were either extracted from the assembled genomes or obtained from the study of Wingfield . The phylogeny was constructed using the TrN+I+G model, with confidence values based on 1000 bootstrap replicates. Only bootstrap values ≥ 75 are shown.
Fig. 6. Identity verification of Cercospora beticola isolates sequenced in this study. The phylogeny was constructed by Bayesian inference (MrBayes v. 3.1.2; Ronquist & Huelsenbeck 2003) based on the sequences of five loci: ITS, act, cmd, his and tef1-α. Sequence alignments were produced using MAFFT v. 7 (Katoh & Standley 2013). Branches with a posterior probability of 1.00 are thickened. The tree was rooted to C. zeae-maydis (CBS 117757).
Fig. 7. Maximum Likelihood tree of genome-sequenced strains producing echinocandins (red) and selected strains of the Leotiomycetes (55 strains total) based on an alignment of the ITS and 28S rDNA. Botryotinia fuckeliana was positioned as the outgroup. The tree was inferred by using the maximum likelihood method based on the Kimura 2-parameter model. The tree with the highest log likelihood (–4229.10) is shown. The percentage of trees in which the associated taxa clustered together is labelled on branch nodes. A discrete gamma distribution was used to model evolutionary rate differences among sites (5 categories; +G, parameter = 0.4881). Branch lengths were measured in the number of substitutions per site. All positions containing gaps and missing data were eliminated. The dataset included 955 positions. Data were analyzed in MEGA7 (
Fig. 8. A Maximum Likelihood phylogeny showing the placement of the F. fracticaudum isolate (indicated in bold) that was sequenced in this study. The tree was inferred from combined β-tubulin and translation elongation factor 1-α gene sequences (Herron ). Values at branch nodes are bootstrap confidence values; only those ≥ 85 % are shown. The scale bar indicates substitutions per site.
Genome statistics for F. fracticaudum and its close relatives.
| Genome size (Mb) | 46.29 | 47.83 | 43.43 | 45.46 |
| GC content (%) | 47.6 | 46.0 | 47.4 | 47.0 |
| Predicted ORFs | 14 729 | 14 640 | 15 056 | 14 284 |
| Average gene length (bp) | 1531 | 1472 | 1312 | 1575 |
| Gene density (ORFs/Mb) | 318 | 306 | 347 | 314 |
1Wingfield ;
2Wingfield ;
3Wingfield ;
4Open reading frames.
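The gene density row in the table above can be recomputed from the genome size and predicted ORF rows. The following is a minimal sketch, assuming density is simply the ORF count divided by genome size in Mb and rounded to the nearest integer. The function name `gene_density` is hypothetical, and the values are kept in column order because the column labels are not available here.

```python
def gene_density(n_orfs, genome_size_mb):
    """Return gene density in ORFs per Mb."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return n_orfs / genome_size_mb


# (genome size in Mb, predicted ORFs), in the column order of the table.
columns = [
    (46.29, 14729),
    (47.83, 14640),
    (43.43, 15056),
    (45.46, 14284),
]

densities = [round(gene_density(orfs, size)) for size, orfs in columns]
print(densities)  # [318, 306, 347, 314], matching the reported row
```

The same calculation applied to a table whose assembly sizes are rounded to one decimal place may differ slightly from the reported densities.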
Fig. 9. A Bayesian inference (BI) phylogenetic analysis of the genus Morchella using MrBayes v3.2.6, based on partial sequences of the elongation factor 1-alpha (EF1-α) gene. Posterior probabilities are shown on the nodes of the tree. The sequence of the Morchella septimelata isolate used for verification was extracted from the assembled genome. Reference sequences, with their accession numbers, were obtained from the NCBI database.
Genome statistics, CAZYme richness and secondary metabolite clusters for the Morchella septimelata MG91 genome sequence.
| Genome | |
|---|---|
| Coverage | 151.17x |
| BUSCO | 97.7% |
| Total sequence length (Mb) | 49.81 |
| Scaffolds | 6 525 |
| Scaffold N50 (bp) | 37 734 |
| GC (%) | 47.40 |
| Predicted gene models | 11 427 |
| Average gene length (bp) | 1 571 |
| Average gene density (genes/Mb) | 229 |
| Total CAZYmes | 512 |
| Auxiliary activities | 75 |
| Pectate lyases | 23 |
| Glycosyl transferases | 75 |
| Glycoside hydrolases | 201 |
| Carbohydrate esterases | 72 |
| Carbohydrate binding motifs | 66 |
| Total SM clusters | 9 |
| Terpene clusters | 3 |
| Type I polyketide synthetases (PKSs) | 1 |
| Nonribosomal peptide synthetases (NRPSs) | 1 |
| Others | 4 |
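As a quick consistency check on the table above, the category counts can be summed and compared with the reported totals. This is a minimal sketch using only the values listed for Morchella septimelata MG91. The helper name `unaccounted` is hypothetical.

```python
def unaccounted(total, parts):
    """Return the reported total minus the sum of the listed categories."""
    return total - sum(parts.values())


cazyme_classes = {
    "Auxiliary activities": 75,
    "Pectate lyases": 23,
    "Glycosyl transferases": 75,
    "Glycoside hydrolases": 201,
    "Carbohydrate esterases": 72,
    "Carbohydrate binding motifs": 66,
}
sm_classes = {
    "Terpene clusters": 3,
    "Type I PKSs": 1,
    "NRPSs": 1,
    "Others": 4,
}

print(unaccounted(512, cazyme_classes))  # 0: the six classes sum to 512
print(unaccounted(9, sm_classes))        # 0: the four classes sum to 9
```

A non-zero result would suggest that some categories are not listed in the table rather than that the total is wrong.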